In OpenMP Tutorial Part 1: Basic Syntax (with complete code), we only executed the code inside the parallel region many times over: we spawned several threads, and each thread ran the function inside (printf) once, so no real division of work took place. In OpenMP, the computation can be divided among multiple threads with the worksharing constructs: the for construct, the sections construct, and the single construct. Below we use these worksharing constructs to achieve genuine work division and true parallelization, improving performance.
Complete code
- Full source code on GitHub: https://github.com/grandma-tutorial/OpenMP-tutorial
OpenMP syntax
- #pragma omp for
- #pragma omp sections
- #pragma omp single
OpenMP example: parallelizing a for loop
- #pragma omp for : parallelizes a for loop. Note that in the example below, each array element must not depend on any other element; every iteration has to be independent (the order of the computations must not matter), otherwise parallelizing the loop will give wrong results.
- #pragma omp for must be placed inside a parallel region (inside the braces of #pragma omp parallel); the same applies to the other worksharing constructs. OpenMP also offers a combined form of the two directives, sketched right after this list.
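As a side note (not part of the original example), the combined directive #pragma omp parallel for opens the parallel region and shares the loop in a single line. A minimal sketch of the same array update written with the combined form:
// Sketch: combined parallel-for form of the example below
#include <stdio.h>
#include <omp.h>

int main()
{
    const int N = 8;
    int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};

    // one directive both creates the thread team and splits the loop iterations
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
    {
        A[i] = A[i] + 1;
        printf("A[%d] is computed by thread number : %d\n", i, omp_get_thread_num());
    }
    return 0;
}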
# === compile ===
$ g++ -fopenmp example_worksharing_1.cpp -o example_worksharing_1.out
// ** filename: example_worksharing_1.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com
#include <stdio.h>
#include <omp.h>
int main()
{
const int N = 8;
int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};
#pragma omp parallel
{
const int thread_id = omp_get_thread_num();
// parallel computing for A[i] = A[i] + 1
#pragma omp for
for (int i = 0; i < N; i++)
{
A[i] = A[i] + 1;
printf("A[%d] is computed by thread number : %d\n", i, thread_id);
}
}
// print the array after parallel computing
for (int i = 0; i < N; i++)
{
printf("%d, ", A[i]);
}
printf("\n");
return 0;
}
# === execute ===
$ ./example_worksharing_1.out
# === output ===
A[6] is computed by thread number : 3
A[7] is computed by thread number : 3
A[4] is computed by thread number : 2
A[5] is computed by thread number : 2
A[0] is computed by thread number : 0
A[1] is computed by thread number : 0
A[2] is computed by thread number : 1
A[3] is computed by thread number : 1
2, 3, 4, 5, 6, 7, 8, 9,
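In this run the eight iterations were handed out as contiguous chunks of two per thread; the exact assignment (and the print order) can differ between runs and compilers. If you want to control how iterations are distributed, OpenMP provides a schedule clause on the for construct. A minimal sketch, assuming the same array as above (the chunk size 2 is just an illustrative choice):
// Sketch: explicit schedule clause on the worksharing for construct
#include <stdio.h>
#include <omp.h>

int main()
{
    const int N = 8;
    int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};

    #pragma omp parallel
    {
        // schedule(static, 2): chunks of 2 iterations, assigned round-robin;
        // schedule(dynamic) would let threads grab chunks as they become free
        #pragma omp for schedule(static, 2)
        for (int i = 0; i < N; i++)
        {
            A[i] = A[i] + 1;
        }
    }
    return 0;
}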
OpenMP example: parallelizing with sections
- #pragma omp sections : contains several section directives (note: section, without the "s"); you manually split the code into multiple sections.
- Each section is executed by only one thread.
# === compile ===
$ g++ -fopenmp example_worksharing_2.cpp -o example_worksharing_2.out
// ** filename: example_worksharing_2.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com
#include <stdio.h>
#include <omp.h>
int main()
{
const int N = 8;
int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};
#pragma omp parallel
{
const int thread_id = omp_get_thread_num();
// parallel computing for A[i] = A[i] + 1
#pragma omp sections
{
// section 1
#pragma omp section
{
for (int i = 0; i < N / 2; i++)
{
A[i] = A[i] + 1;
printf("A[%d] is computed by thread number : %d\n", i, thread_id);
}
}
// section 2
#pragma omp section
{
for (int i = N / 2; i < N; i++)
{
A[i] = A[i] + 1;
printf("A[%d] is computed by thread number : %d\n", i, thread_id);
}
}
}
}
// print the array after parallel computing
for (int i = 0; i < N; i++)
{
printf("%d, ", A[i]);
}
printf("\n");
return 0;
}
# === execute ===
$ ./example_worksharing_2.out
# === output ===
A[0] is computed by thread number : 3
A[1] is computed by thread number : 3
A[2] is computed by thread number : 3
A[3] is computed by thread number : 3
A[4] is computed by thread number : 0
A[5] is computed by thread number : 0
A[6] is computed by thread number : 0
A[7] is computed by thread number : 0
2, 3, 4, 5, 6, 7, 8, 9,
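Two remarks on sections. First, it shines when the pieces of work are genuinely different, e.g. two unrelated tasks, rather than two halves of the same loop. Second, since each section runs on one thread, if the team has more threads than there are section blocks, the extra threads simply wait at the implicit barrier at the end of the sections construct. A minimal sketch of two independent tasks, using the combined parallel sections form (the task functions are illustrative, not from the original post):
// Sketch: sections used for two unrelated tasks
#include <stdio.h>
#include <omp.h>

// illustrative placeholder tasks
void task_a() { printf("task_a on thread %d\n", omp_get_thread_num()); }
void task_b() { printf("task_b on thread %d\n", omp_get_thread_num()); }

int main()
{
    // combined form: creates the thread team and the sections construct at once
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            task_a();
        }
        #pragma omp section
        {
            task_b();
        }
    }
    return 0;
}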
OpenMP example: single execution
- #pragma omp single : inside a parallel region, code is normally executed by every thread, but code placed in a single construct is executed by only one thread.
# === compile ===
$ g++ -fopenmp example_worksharing_3.cpp -o example_worksharing_3.out
// ** filename: example_worksharing_3.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com
#include <stdio.h>
#include <omp.h>
int main()
{
#pragma omp parallel
{
const int thread_id = omp_get_thread_num();
printf("**Outside** single section, I am thread number %d\n", thread_id);
#pragma omp single
{
printf("!!Inside!! single section, I am thread number %d\n", thread_id);
}
}
return 0;
}
# === execute ===
$ ./example_worksharing_3.out
# === output ===
**Outside** single section, I am thread number 0
!!Inside!! single section, I am thread number 0
**Outside** single section, I am thread number 1
**Outside** single section, I am thread number 3
**Outside** single section, I am thread number 2
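One detail worth knowing: a single construct has an implicit barrier at its end, so the other threads wait until the chosen thread finishes the block before continuing. If that synchronization is not needed, the nowait clause removes the barrier. A minimal sketch, reusing the structure of example_worksharing_3.cpp:
// Sketch: single with the nowait clause (no barrier at the end of the block)
#include <stdio.h>
#include <omp.h>

int main()
{
    #pragma omp parallel
    {
        const int thread_id = omp_get_thread_num();

        // exactly one thread prints this; thanks to nowait, the other threads
        // do not wait here and go straight to the printf below
        #pragma omp single nowait
        {
            printf("!!Inside!! single section, I am thread number %d\n", thread_id);
        }

        printf("**Outside** single section, I am thread number %d\n", thread_id);
    }
    return 0;
}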