One paper accepted to the Journal of Systems Architecture (Journal, SCIE)

One paper was accepted to the Journal of Systems Architecture. The paper presents an efficient prefetching approach for Transformer workloads on GPU unified virtual memory (UVM). Our analysis reveals that the default tree-based neighborhood prefetcher in UVM cannot exploit data locality that extends beyond a single virtual address block (VABlock). By extending the prefetching range to multiple VABlocks, the proposed approach can improve the performance of Transformer workloads when GPU memory is oversubscribed.