Comments on Krister Walfridsson’s old blog: Building GCC with support for NVIDIA PTX offloading
Feed updated 2023-07-02T17:04:56.386+02:00

2021-05-01T23:38:46.654+02:00
Sorry for not answering this before; my comment notification was broken :(
Answering now as this is a common problem.

The most common reason for the code not being run on the GPU is that the path to $install_dir/lib64 has not been added to LD_LIBRARY_PATH.

It is possible to verify that the code is executed on the GPU by setting the GOMP_DEBUG environment variable to 1; this should print lots of information about the kernel and how it is executed on the GPU.
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2020-04-26T00:37:17.109+02:00
In Ubuntu 18.04, using the exact script, I was getting errors while building the nvptx GCC and the host GCC. I tried to tweak it a little, for example by changing the GCC version to match the CUDA version, but still got errors like:

Makefile:380: recipe for target 'lib_a-locale.o' failed
or
nvptx-as: ptxas returned 255 exit status
or
ptxas lib_a-locale.o, line 37; fatal : Invalid initial value expression
or
configure: error: cannot compute suffix of object files: cannot compile
or
gcc-7.3.0: ptxas lib_a-hash_func.o, line 11; fatal : Invalid initial value expression
or
x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: libgomp.spec: No such file or directory

So..
To anyone in that situation, this might be useful.
I managed to make it work by:
1) Changing the repository of nvptx-newlib, as the one in the script is obsolete (according to the description of the repository itself). So I used the latest newlib, which now contains the nvptx port:
git clone git://sourceware.org/git/newlib-cygwin.git

2) Changing the repository of GCC to the trunk version (apparently in 12/2017 the trunk version was the problematic one... Maybe this is a cyclic thing? Depending on when you're reading this comment, try changing the GCC version; it might help). In my case:
git clone https://github.com/gcc-mirror/gcc

I hope it helps someone :)
Posted by Vinícius Pacheco (https://www.blogger.com/profile/00757280714999765465)

2018-10-18T22:52:31.546+02:00
Hey, when compiling on current Manjaro I get:
fatal error: sys/ustat.h: No such file or directory
during host compiler compilation. I read that it's a lib that got removed. Do you know how to get around that?
Posted by Coder (https://www.blogger.com/profile/00662178676795485451)

2018-04-17T16:36:10.816+02:00
Additional information:

Result of "acc_get_num_devices(acc_device_nvidia)" is 0. Why do you think this is happening?

Result of "offload/install/bin/gcc -v":

Using built-in specs.
COLLECT_GCC=/offload/install/bin/gcc
COLLECT_LTO_WRAPPER=/offload/install/libexec/gcc/x86_64-pc-linux-gnu/7.2.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-offload-targets=nvptx-none --with-cuda-driver-include=/usr/local/cuda/include --with-cuda-driver-lib=/usr/local/cuda/lib64 --disable-bootstrap --disable-multilib --enable-languages=c,c++,fortran,lto --prefix=/offload/install
Thread model: posix
gcc version 7.2.0 (GCC)
Posted by Anonymous (https://www.blogger.com/profile/04043116755108590725)

2018-04-17T16:16:03.819+02:00
Hi,

I managed to follow the steps you indicated and install GCC with offloading support. I then wrote a simple program to check that everything is working. It looks like this:

#pragma acc parallel loop
for (int j = 0; j < 10; j++) {
    x[j] = j;
    y[j] = -j;
}

I can compile it with /offload/install/bin/g++ -O3 -fopenacc test.cpp and run the executable. But when I run the code under a profiler from PGI to check whether the GPU is being used, it is not. How can I confirm that OpenACC is parallelizing the code?
Posted by Anonymous (https://www.blogger.com/profile/04043116755108590725)

2018-04-02T11:11:38.058+02:00
No, I don't have any good idea... :(
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2018-03-28T00:53:48.043+02:00
Hi,

Your script is very useful. Thank you.

I managed to compile my code with: $install_dir/bin/gcc -O3 -fopenmp -foffload=nvptx-none -foffload=-lm main.c

But when I run the executable, I get the following error:

libgomp: cuCtxSynchronize error: the launch timed out and was terminated
libgomp: cuMemFreeHost error: the launch timed out and was terminated
libgomp: device finalization failed

Do you know what could be wrong?

Thank you.
Posted by Angelica (https://www.blogger.com/profile/18272953651225804494)

2018-03-13T21:25:18.635+01:00
Try updating binutils.
Posted by MD (https://www.blogger.com/profile/17025732831924692076)

2017-12-29T05:56:50.791+01:00
Now it's working with this updated script.
Posted by Anonymous (https://www.blogger.com/profile/14618005321735792820)

2017-12-27T13:33:47.745+01:00
Now works perfectly, thank you.
Posted by Zanathos (https://www.blogger.com/profile/11095089819897917843)

2017-12-26T21:41:10.179+01:00
This means that it is using your system's libgomp instead of the newly built library.
Add the path to the newly built library (typically lib64 in your $install_dir) to LD_LIBRARY_PATH to make it use the correct version.
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-12-26T21:13:27.211+01:00
Hi, after the changes to the script I get this error when I try to launch the resulting executable:

libgomp: Library too old for offload (version 0 < 1)

I compile with this command:

g++ -std=c++11 -O3 -fopenmp -DOPENMP -foffload=nvptx-none main.cpp -o main

and it compiles with no errors. Thank you for any help.
Posted by Zanathos (https://www.blogger.com/profile/11095089819897917843)

2017-12-23T17:53:19.411+01:00
Thank you so much!!! I'll be eagerly waiting for your fix.
Posted by Anonymous (https://www.blogger.com/profile/14618005321735792820)

2017-12-23T01:32:30.727+01:00
The problem seems to only occur on recent trunk versions, so I have now updated the script to build GCC 7.2 instead.
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-12-23T00:23:30.268+01:00
I can reproduce this now, even though I do not understand why one of my build trees works fine and one fails...
I'll investigate this, but I will be busy with Christmas-related things the coming days, so I do not expect to have any solution until the end of next week... :(
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-12-22T21:48:14.482+01:00
I'm facing the same issue. Any help would be great.
Posted by Anonymous (https://www.blogger.com/profile/14618005321735792820)

2017-11-13T00:41:40.966+01:00
I do not have any good idea what may be wrong... libgomp.spec is supposed to be present in the same directory as libgomp.so (i.e. $install_dir/lib64).

You can see where GCC tries to find it if you compile using -v; the search path is the one shown as LIBRARY_PATH right before the error message.
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-11-08T20:24:46.519+01:00
Hi, I'm trying to compile a simple example but I get this error:
x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: libgomp.spec: No such file or directory

I added the library path to the .profile file, and when I try to use the compiler I get the error above.
Thanks for any help.
Posted by Zanathos (https://www.blogger.com/profile/11095089819897917843)

2017-09-10T03:11:45.667+02:00
I don't have any good idea what may be wrong...

But something seems strange with your installation; it should not need -flto. I'll try to figure out how it differs from my installation if you mail me (krister.walfridsson at gmail dot com) the output of
offload/wrk/install/bin/gcc -O3 -fopenacc -foffload=nvptx-none -foffload=-lm vecadd.c -lm -v
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-09-08T21:17:24.371+02:00
Additional info:

When I compile without the -flto flag, it doesn't compile and I get these errors.

offload/wrk/install/bin/gcc -O3 -fopenacc -foffload=nvptx-none -foffload=-lm vecadd.c -lm

gcc: warning: ‘-x lto’ after last input file has no effect
gcc: fatal error: no input files
compilation terminated.
lto-wrapper: fatal error: offload/wrk/install/bin/gcc returned 1 exit status
compilation terminated.
collect2: fatal error: lto-wrapper returned 1 exit status
compilation terminated.
Posted by Anonymous (https://www.blogger.com/profile/13635007778894013086)

2017-09-08T21:10:03.781+02:00
I'm having problems making it work. Can you help me? CUDA works on my machine. It runs CentOS, and I compiled with GCC 7.2.

The code compiles, but when I run it, I get:

libgomp: target function wasn't mapped

Any ideas?
Posted by Anonymous (https://www.blogger.com/profile/13635007778894013086)

2017-05-17T01:15:18.470+02:00
It will be taken as an "order" to use the GPU, and GCC does not try to handle overlap.
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-05-12T23:07:59.581+02:00
Will the #pragma be taken as an "order" to use the GPU, or merely as an invitation to see if using the GPU would appear advantageous?
If a piece of code would take 20us to run on the main CPU, or 5us on the main CPU plus 30us on the GPU, it would seem that using the GPU would be a win if the main CPU could overlap enough computation with the GPU that it would end up being idle for 15us or less waiting for the GPU to finish. Does GCC try to handle overlap, and if so, how?
Posted by supercat (https://www.blogger.com/profile/12531904492602532373)

2017-05-09T22:09:32.059+02:00
GCC does not try to schedule things on the GPU by itself; you need to decorate the code using #pragma to tell the compiler that you intend it to run on the GPU (and that it is safe to do it).
Posted by Krister Walfridsson (https://www.blogger.com/profile/02591279630933941271)

2017-04-26T17:53:46.520+02:00
Does GCC try to examine code and then decide to generate accelerator or normal CPU code based upon what it thinks will be useful in any given situation? What does it ensure about semantics? I would expect that optimal performance would often be achieved by having an accelerator perform some operations in parallel with the main CPU; does GCC use "restrict" to determine when that is and is not safe?
Posted by supercat (https://www.blogger.com/profile/12531904492602532373)
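
Editor's sketch, not part of the original thread: the replies above say that offloading is opt-in via #pragma, and several commenters ask how to tell whether the GPU was actually used. The C fragment below is a minimal illustration under a few assumptions. It uses OpenMP target pragmas rather than the OpenACC pragma from the test program quoted earlier, the function names fill_arrays and target_runs_on_host are made up for this example, and omp_is_initial_device() is the standard OpenMP runtime call for asking where a region runs. The #ifdef guards only make the file build as plain host code when OpenMP is not enabled.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns true if a target region ends up running on
   the host (no offloading happened), false if it ran on an offload device. */
bool target_runs_on_host(void)
{
    int on_host = 1;  /* stays 1 when built without OpenMP support */

    #pragma omp target map(tofrom: on_host)
    {
#ifdef _OPENMP
        on_host = omp_is_initial_device();
#endif
    }
    return on_host != 0;
}

/* Hypothetical helper: the x[j] = j, y[j] = -j loop from the thread.
   The pragma is the explicit request to offload the loop; the compiler
   does not make that decision on its own. */
void fill_arrays(int *x, int *y, int n)
{
    assert(n >= 0);

    #pragma omp target teams distribute parallel for map(from: x[0:n], y[0:n])
    for (int j = 0; j < n; j++) {
        x[j] = j;
        y[j] = -j;
    }
}
```

With the compiler built by the script, the flags quoted earlier in the thread ($install_dir/bin/gcc -O3 -fopenmp -foffload=nvptx-none) should be enough to build a program that calls these two functions. If target_runs_on_host() reports true, the LD_LIBRARY_PATH and GOMP_DEBUG advice at the top of the thread is probably the first thing to check.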