Computer System Architecture Lab
LogFlex: Flexible-Bit Log Arithmetic Accelerator for Language Models on Edge
Deploying language models (LMs) on resource-constrained mobile/wearable devices while maintaining the output quality is challenging. To …
Yujin Kim, Faraz Tahmasebi, Gunjae Koo, Hyoukjun Kwon
TM-Training: An Energy-Efficient Tiered Memory System for Deep Learning Training in NPUs
DRAM accounts for a large fraction of the total cost of ownership of memory systems in deep learning acceleration systems. To achieve …
Jaeyong Park, Sangun Choi, Jongmin Kim, Gunjae Koo, Myung Kuk Yoon, Yunho Oh
MOST: Memory Oversubscription-Aware Scheduling for Tensor Migration on GPU Unified Storage
Deep Neural Network (DNN) training demands large memory capacities that exceed the limits of current GPU onboard memory. Expanding GPU …
Junsu Kim, Jaebeom Jeon, Jaeyong Park, Sangun Choi, Minseong Gil, Seokin Hong, Gunjae Koo, Myung Kuk Yoon, Yunho Oh
Beyond VABlock: Improving Transformer Workloads through Aggressive Prefetching
The memory capacity constraint of GPUs is a major challenge in running large deep learning workloads with their ever-increasing memory …
Jane Rhee, Ikyoung Choi, Gunjae Koo, Yunho Oh, Myung Kuk Yoon
TLP Balancer: Predictive Thread Allocation for Multitenant Inference in Embedded GPUs
This letter introduces a novel software technique to optimize thread allocation for merged and fused kernels in multitenant inference …
Minseong Gil, Jaebeom Jeon, Junsu Kim, Sangun Choi, Gunjae Koo, Myung Kuk Yoon, Yunho Oh
SAVector: Vectored Systolic Arrays
Conventional DNN inference accelerators are designed with a few (up to four) large systolic arrays. As such a scale-up architecture …
Sangun Choi, Seongjun Park, Jaeyong Park, Jongmin Kim, Gunjae Koo, Seokin Hong, Myung Kuk Yoon, Yunho Oh
Conflict-Aware Compiler for Hierarchical Register File on GPUs
Modern graphics processing units (GPUs) leverage a high degree of thread-level parallelism, necessitating large-sized register files …
Eunbi Jeong, Eun Seong Park, Gunjae Koo, Yunho Oh, Myung Kuk Yoon
Adaptive Kernel Merge and Fusion for Multi-Tenant Inference in Embedded GPUs
This paper proposes a new scheme that improves throughput and reduces queuing delay while running multiple inferences in embedded …
Jaebeom Jeon, Gunjae Koo, Myung Kuk Yoon, Yunho Oh
Vizard: Passing over Profiling-Based Detection by Manipulating Performance Counters
Cache side-channel attacks have been serious security threats to server computer systems, thus researchers have proposed software-based …
Minkyu Song, Taeweon Suh, Gunjae Koo
Analyzing GCN Aggregation on GPU
Graph convolutional neural networks (GCNs) are emerging neural networks for graph structures that include large features associated …
Inje Kim, Jonghyun Jeong, Yunho Oh, Myung Kuk Yoon, Gunjae Koo