[Parallel Computing] OpenMP Tutorial (2): Worksharing

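In "Multithreaded Parallel Computing, OpenMP Tutorial Part 1: Basic Syntax (with complete code)", we only executed the code inside the parallel region many times: we started several threads and each thread ran the function inside the region (printf) once, so no work was actually divided. In OpenMP, to divide a computation among multiple threads, you can use the worksharing constructs: the for construct, the sections construct, and the single construct. Below we use these worksharing constructs to divide the work for real, run it in parallel, and improve performance.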
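
To make the difference concrete, here is a minimal sketch that counts how many loop bodies run in total with and without a worksharing construct. The function names `count_replicated` and `count_shared` are hypothetical helpers made up for this illustration, and `#pragma omp atomic` is used only to keep the shared counters safe.

```cpp
#include <cassert>

// Hypothetical helper: the loop sits in a parallel region WITHOUT a
// worksharing construct, so every thread runs all n iterations.
// team_size receives the number of threads that entered the region.
long count_replicated(int n, int &team_size)
{
    assert(n >= 0);
    long executed = 0;
    int threads = 0;
#pragma omp parallel
    {
#pragma omp atomic
        threads++; // each thread registers itself once

        for (int i = 0; i < n; i++)
        {
#pragma omp atomic
            executed++; // counted once per thread per iteration
        }
    }
    team_size = threads;
    return executed;
}

// Hypothetical helper: the same loop, but "omp for" splits the n
// iterations among the threads, so each iteration runs exactly once.
long count_shared(int n)
{
    assert(n >= 0);
    long executed = 0;
#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < n; i++)
        {
#pragma omp atomic
            executed++;
        }
    }
    return executed;
}
```

With a team of T threads, `count_replicated(8, team)` should return 8 * T, while `count_shared(8)` should return 8 regardless of T.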

Complete code

OpenMP syntax

  • # pragma omp for
  • # pragma omp sections
  • # pragma omp single

OpenMP example program: parallelizing a for loop

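  1. # pragma omp for : parallelizes a for loop. Note that in the example below no array element may depend on any other element; every computation must be independent (the order of computation does not matter). Only then can the for loop run in parallel without errors.
  2. To actually divide the work, # pragma omp for must be encountered inside a parallel region (here, within the braces of #pragma omp parallel). The same applies to the other worksharing constructs: outside a parallel region there is only one thread to give the work to.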
# === Compile ===
$ g++ -fopenmp example_worksharing_1.cpp -o example_worksharing_1.out
// ** File name: example_worksharing_1.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com

#include <stdio.h>
#include <omp.h>

int main()
{
    const int N = 8;
    int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};
#pragma omp parallel
    {

        const int thread_id = omp_get_thread_num();

// parallel computing for A[i] = A[i] + 1
#pragma omp for
        for (int i = 0; i < N; i++)
        {
            A[i] = A[i] + 1;
            printf("A[%d] is computed by thread number : %d\n", i, thread_id);
        }
    }

    // print the array after parallel computing
    for (int i = 0; i < N; i++)
    {
        printf("%d, ", A[i]);
    }
    printf("\n");
    return 0;
}
# === Execute ===
$ ./example_worksharing_1.out

# === Output (line order and thread numbers can differ between runs) ===
A[6] is computed by thread number : 3
A[7] is computed by thread number : 3
A[4] is computed by thread number : 2
A[5] is computed by thread number : 2
A[0] is computed by thread number : 0
A[1] is computed by thread number : 0
A[2] is computed by thread number : 1
A[3] is computed by thread number : 1
2, 3, 4, 5, 6, 7, 8, 9, 
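
The independence requirement can be sketched by putting the loop from the example next to one that does not qualify. This is only an illustration: `add_one_parallel`, `prefix_sum_serial`, and `self_check` are hypothetical names, and the prefix sum is an example chosen here, not something from the program above.

```cpp
#include <cassert>
#include <vector>

// Independent iterations: element i depends only on a[i], so the
// iterations can be handed to different threads in any order.
void add_one_parallel(std::vector<int> &a)
{
    const int n = static_cast<int>(a.size());
#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < n; i++)
        {
            a[i] = a[i] + 1;
        }
    }
}

// Loop-carried dependency: a[i] needs the already-updated a[i - 1].
// Putting "omp for" on this loop as written would give wrong results,
// so it is kept serial in this sketch.
void prefix_sum_serial(std::vector<int> &a)
{
    const int n = static_cast<int>(a.size());
    for (int i = 1; i < n; i++)
    {
        a[i] = a[i] + a[i - 1];
    }
}

// Small usage example for the parallel version.
void self_check()
{
    std::vector<int> a = {1, 2, 3, 4, 5, 6, 7, 8};
    add_one_parallel(a);
    assert(a.front() == 2 && a.back() == 9);
}
```

The first function gives the same result whichever thread handles which index; the second one relies on the iterations running in order.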

OpenMP example program: parallelizing with sections

  1. # pragma omp sections : contains several section blocks (without the "s"); you split the code into sections by hand
  2. Each section is executed by exactly one thread
# === Compile ===
$ g++ -fopenmp example_worksharing_2.cpp -o example_worksharing_2.out
// ** File name: example_worksharing_2.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com

#include <stdio.h>
#include <omp.h>

int main()
{
    const int N = 8;
    int A[N] = {1, 2, 3, 4, 5, 6, 7, 8};
#pragma omp parallel
    {
        const int thread_id = omp_get_thread_num();

// parallel computing for A[i] = A[i] + 1
#pragma omp sections
        {
// section 1
#pragma omp section
            {
                for (int i = 0; i < N / 2; i++)
                {
                    A[i] = A[i] + 1;
                    printf("A[%d] is computed by thread number : %d\n", i, thread_id);
                }
            }

// section 2
#pragma omp section
            {
                for (int i = N / 2; i < N; i++)
                {
                    A[i] = A[i] + 1;
                    printf("A[%d] is computed by thread number : %d\n", i, thread_id);
                }
            }
        }
    }

    // print the array after parallel computing
    for (int i = 0; i < N; i++)
    {
        printf("%d, ", A[i]);
    }
    printf("\n");
    return 0;
}
# === Execute ===
$ ./example_worksharing_2.out

# === Output (line order and thread numbers can differ between runs) ===
A[0] is computed by thread number : 3
A[1] is computed by thread number : 3
A[2] is computed by thread number : 3
A[3] is computed by thread number : 3
A[4] is computed by thread number : 0
A[5] is computed by thread number : 0
A[6] is computed by thread number : 0
A[7] is computed by thread number : 0
2, 3, 4, 5, 6, 7, 8, 9, 
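
Sections do not have to split one loop in half; they can also hold two unrelated tasks. The following is a minimal sketch under that assumption: the `Stats` struct and the `sum_and_max` function are hypothetical names introduced for illustration, and each section writes to its own field so the two sections never touch the same variable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: one field per section.
struct Stats
{
    long sum;
    int max;
};

// Computes the sum in one section and the maximum in the other.
Stats sum_and_max(const std::vector<int> &a)
{
    assert(!a.empty());
    Stats s = {0, a[0]};
#pragma omp parallel
    {
#pragma omp sections
        {
// section 1: sum of all elements, written only to s.sum
#pragma omp section
            {
                long local = 0;
                for (int v : a)
                {
                    local += v;
                }
                s.sum = local;
            }

// section 2: largest element, written only to s.max
#pragma omp section
            {
                int local = a[0];
                for (int v : a)
                {
                    if (v > local)
                    {
                        local = v;
                    }
                }
                s.max = local;
            }
        }
    }
    return s;
}
```

For the array {1, 2, 3, 4, 5, 6, 7, 8} used above, the sum is 36 and the maximum is 8, no matter which threads pick up the two sections.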

OpenMP example program: single (run once)

  1. # pragma omp single : inside a parallel region, code is normally executed by every thread, but code inside a single construct is executed by only one thread
# === Compile ===
$ g++ -fopenmp example_worksharing_3.cpp -o example_worksharing_3.out
// ** File name: example_worksharing_3.cpp **
// 都會阿嬤 OpenMP tutorial
// 都會阿嬤 https://weikaiwei.com

#include <stdio.h>
#include <omp.h>

int main()
{

#pragma omp parallel
    {
        const int thread_id = omp_get_thread_num();

        printf("**Outside** single section, I am thread number %d\n", thread_id);

#pragma omp single
        {
            printf("!!Inside!!  single section, I am thread number %d\n", thread_id);
        }
    }
    return 0;
}
# === Execute ===
$ ./example_worksharing_3.out

# === Output (line order and thread numbers can differ between runs) ===
**Outside** single section, I am thread number 0
!!Inside!!  single section, I am thread number 0
**Outside** single section, I am thread number 1
**Outside** single section, I am thread number 3
**Outside** single section, I am thread number 2
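
One common use of single is a one-time setup step in front of a shared loop. The sketch below assumes that pattern: `scale_by_shared_factor` and the factor value 3 are made up for illustration. It relies on the implicit barrier at the end of a single construct (present unless `nowait` is specified), which makes the other threads wait until the setup is done.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: one thread sets a shared factor, then all
// threads share the loop that uses it.
std::vector<int> scale_by_shared_factor(const std::vector<int> &in)
{
    const int n = static_cast<int>(in.size());
    std::vector<int> out(n);
    int factor = 0; // shared by all threads in the region
#pragma omp parallel
    {
#pragma omp single
        {
            factor = 3; // hypothetical setup step, executed by one thread only
        }
        // Implicit barrier at the end of single: every thread now sees factor == 3.

#pragma omp for
        for (int i = 0; i < n; i++)
        {
            out[i] = in[i] * factor;
        }
    }
    assert(factor == 3); // set exactly once inside the single construct
    return out;
}
```

Calling it on {1, 2, 3} should give {3, 6, 9}.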
