Universidade Federal de Santa Catarina (UFSC)
Programa de Pós-graduação em Engenharia, Gestão e Mídia do Conhecimento (PPGEGC)
Details of the Analyzed Document

Center: Not provided

Department: Not provided

Institutional Dimension: Graduate Studies

SDG Dimension: Economic

Document Type: Master's Thesis

Title: BINLPT: A WORKLOAD-AWARE PARALLEL LOOP SCHEDULER FOR LARGE-SCALE MULTICORE PLATFORMS

Advisor
  • MARCIO BASTOS CASTRO
Student
  • PEDRO HENRIQUE DE MELLO MORADO PENNA

Content

The high-performance computing community seeks efficient and scalable solutions to meet the ever-increasing performance demands of large-scale applications. To achieve this goal, intricacies of the target application and platform are often exploited so that specific techniques can be applied. In this context, the irregularity of the application is an important characteristic that should be considered. For instance, when scheduling loop iterations of shared-memory applications, workload-aware scheduling strategies stand out as the most promising approach. Unfortunately, existing strategies based on this finding present several drawbacks that should be overcome. First, these strategies rely on profiling and statistical regression techniques, and are thus inherently designed for well-behaved workloads. Second, workload-aware strategies fail to apply their knowledge about the underlying workload of the target irregular loop when scheduling chunks of iterations. Third, existing strategies have not yet been comprehensively evaluated with respect to variations in the workload. Finally, despite the existence of several workload-aware strategies, none of them is integrated into a publicly available library for parallel programming, which makes it harder for applications to benefit from them. To address these challenges, in this work we propose a novel workload-aware loop scheduling strategy called BinLPT. To enable superior performance and flexibility, our strategy relies on features such as a user-supplied estimate of the workload of the target irregular loop and an adaptive scheduling heuristic based on the Longest Processing Time (LPT) rule. We integrated BinLPT into OpenMP and made our implementation publicly available.
To conceive BinLPT, we relied on two cornerstones, both devised during the preparation of this master's thesis: a simulation-guided design methodology and a proof-of-concept workload-aware loop scheduler named Smart Round-Robin (SRR). We carried out a thorough assessment of BinLPT using simulations, synthetic kernels, and application kernels. We ran experiments on a large-scale NUMA machine and studied different workloads. In the application kernels, our experimental results showed performance improvements of up to 64.91% when using BinLPT, compared with the OpenMP strategies.
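
To make the LPT rule mentioned in the abstract concrete, the following is a minimal sketch of the classic Longest Processing Time heuristic applied to loop iterations. It is not the BinLPT implementation from the thesis; the function name `lpt_schedule`, the per-iteration `workloads` list, and the example numbers are assumptions introduced here for illustration only.

```python
import heapq


def lpt_schedule(workloads, n_threads):
    """Classic LPT rule: heaviest iteration first, to the least-loaded thread.

    `workloads` holds a (hypothetical) user-supplied cost estimate per
    loop iteration. Returns the iterations assigned to each thread and
    the resulting estimated load per thread.
    """
    # Min-heap of (accumulated load, thread id): the root is always
    # the thread with the smallest load so far.
    heap = [(0, t) for t in range(n_threads)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_threads)]

    # Visit iterations from the largest to the smallest estimate.
    order = sorted(range(len(workloads)),
                   key=lambda i: workloads[i], reverse=True)
    for i in order:
        load, t = heapq.heappop(heap)
        assignment[t].append(i)
        heapq.heappush(heap, (load + workloads[i], t))

    loads = [sum(workloads[i] for i in its) for its in assignment]
    return assignment, loads


# Illustrative, made-up estimates for six iterations and two threads.
assignment, loads = lpt_schedule([7, 1, 5, 3, 2, 8], n_threads=2)
print(assignment, loads)
```

A static block split of the same six iterations would give the two threads estimated loads of 13 and 13 only by coincidence of ordering; the point of LPT is that it balances loads regardless of how the costs are ordered in the iteration space, provided the estimates are reasonable.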

Shannon index: 3.93014

Gini index: 0.930835

SDG 1: 3.48%
SDG 2: 4.99%
SDG 3: 5.91%
SDG 4: 7.24%
SDG 5: 5.13%
SDG 6: 5.66%
SDG 7: 5.87%
SDG 8: 7.72%
SDG 9: 11.94%
SDG 10: 4.56%
SDG 11: 9.63%
SDG 12: 5.41%
SDG 13: 4.23%
SDG 14: 6.59%
SDG 15: 5.36%
SDG 16: 6.27%

Predominant SDG: SDG 9
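
The page does not say how the Shannon and Gini indices are computed. As a hedged sketch, the snippet below assumes a base-2 Shannon entropy and a Gini-Simpson index (one minus the sum of squared shares) over the 16 SDG percentages listed on this page. Under those assumptions, and using the rounded percentages, the results come out close to the reported values; the function names are chosen here for illustration.

```python
import math

# SDG shares as listed on this page, in percent (SDG 1 through SDG 16).
shares_pct = [3.48, 4.99, 5.91, 7.24, 5.13, 5.66, 5.87, 7.72,
              11.94, 4.56, 9.63, 5.41, 4.23, 6.59, 5.36, 6.27]


def shannon_index(p, base=2):
    """Shannon entropy of a probability vector (base 2 is an assumption)."""
    return -sum(x * math.log(x, base) for x in p if x > 0)


def gini_simpson_index(p):
    """Gini-Simpson index: 1 minus the sum of squared shares (an assumption)."""
    return 1.0 - sum(x * x for x in p)


# The rounded percentages do not sum to exactly 100, so normalize first.
total = sum(shares_pct)
p = [s / total for s in shares_pct]

shannon = shannon_index(p)
gini = gini_simpson_index(p)
predominant = max(range(len(p)), key=lambda i: p[i]) + 1  # 1-based SDG number

print(f"Shannon: {shannon:.4f}  Gini-Simpson: {gini:.4f}  "
      f"Predominant: SDG {predominant}")
```

With 16 categories, the base-2 Shannon index is bounded by 4 and the Gini-Simpson index by 0.9375, so values near those bounds indicate that the classification is spread almost evenly across the SDGs.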