One paper accepted to IEEE Embedded Systems Letters (Journal, SCIE)
One paper was accepted to IEEE Embedded Systems Letters. Our paper presents a software approach that optimizes thread allocation for merged and fused kernels in multi-tenant inference tasks on embedded GPU systems. The proposed approach uses performance modeling to identify the best-performing thread counts, improving hardware utilization while ensuring QoS compliance.