[Dev] NeoScrypt GPU Miner - Public Beta Test
-
Hi All,
I think it is worth looking at the current code first:
If you compile it using CodeXL 1.5 with optimization disabled, the scratch register count is huge: more than 1500.
With optimization enabled, the scratch register count is still more than 400.
The key issue for neoscrypt is that it does too much dynamic copying: the buffer positions for B and A have to be calculated at run time.
That makes it hard to make full use of uint4, and the OpenCL compiler needs lots of VREGs to calculate the next buffer position.
The second difficulty is that it uses too many local arrays, and indexing local arrays hurts the OpenCL code’s performance.
Because of these two problems, the registers are used up very quickly and data spills to global memory.
Directly converting the C code to OpenCL is only the very first step; the scratch register count needs to be reduced to 0.
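To picture the dynamic-copy problem: the copy offset is a byte position known only at run time, so a word-oriented kernel cannot simply load fixed uint4 elements. The sketch below is not taken from any miner source. It is a small Python model, with hypothetical function names, of one common workaround: rebuilding a read at an arbitrary byte offset from aligned 32-bit words plus shifts. It assumes little-endian word packing and a buffer padded by one extra word. It only shows the offset arithmetic and says nothing about how a particular OpenCL compiler allocates registers.

```python
MASK32 = 0xFFFFFFFF

def pack_words(data: bytes) -> list[int]:
    """Pack bytes into little-endian 32-bit words (length must be a multiple of 4)."""
    return [int.from_bytes(data[i:i + 4], "little") for i in range(0, len(data), 4)]

def read_words_at(words: list[int], byte_off: int, count: int) -> list[int]:
    """Read `count` 32-bit words starting at an arbitrary byte offset.

    Only whole-word loads are used; the sub-word part of the offset is
    handled with shifts instead of byte-wise indexing.
    Requires words[byte_off // 4 + count] to exist (one word of padding).
    """
    base = byte_off >> 2          # index of the first aligned word
    shift = (byte_off & 3) * 8    # remaining offset in bits: 0, 8, 16 or 24
    out = []
    for i in range(count):
        lo = words[base + i]
        hi = words[base + i + 1]
        if shift == 0:
            out.append(lo)
        else:
            # Low bytes come from `lo`, high bytes from the next word.
            out.append(((lo >> shift) | (hi << (32 - shift))) & MASK32)
    return out
```

For a 64-byte test buffer, `read_words_at(pack_words(data), off, 8)` should match `pack_words(data[off:off + 32])` for any offset that leaves one word of padding.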
I have rewritten the original code and reduced the scratch register count to 372 in non-optimized mode.
I’m not modifying cgminer 3.7.2; instead I am adding code to the original cpuminer and simplifying it. (Currently it works on Win8.1 only.)
Without optimization, my code runs at 95K/s on an R9 290, and with 5 R9 290s I got 440k/s on pxc.theblocksfactory.
With optimization enabled, the code can run at 145k/s and the scratch register count drops to around 220.
Unfortunately, it then produces wrong nonces with very weird intermediate values. I am still testing it.
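As a quick sanity check on those figures, the short calculation below compares the five-card pool rate with five times the single-card rate, and works out the speedup of the optimized build. All inputs are the numbers quoted in the post. A pool-reported rate also fluctuates with share luck, so the scaling figure is only a rough indication.

```python
single_card_khs = 95.0   # unoptimized build, one R9 290 (from the post)
cards = 5
observed_khs = 440.0     # five cards on the pool (from the post)
optimized_khs = 145.0    # optimized build, one card (from the post)

ideal_khs = single_card_khs * cards        # perfect linear scaling
scaling = observed_khs / ideal_khs         # fraction of the ideal reached
speedup = optimized_khs / single_card_khs  # optimized vs. unoptimized

print(f"ideal 5-card rate: {ideal_khs:.0f} k/s")
print(f"observed / ideal:  {scaling:.1%}")
print(f"optimized speedup: {speedup:.2f}x")
```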
I would like to discuss any technical details with all of you.
Thanks
Ralph
-
Hi,
I am the current maintainer of the neoscrypt cgminer. I am fully aware that the kernel is sub-optimal. The focus up to now has been on getting it running and working with pools and wallets. Optimizing the kernel will be done in the not too distant future. I already have some ideas.
Neoscrypt is deliberately designed to be memory (access) intensive, first of all to make the use of ASICs harder. I know that this kernel is a burden for the GPU, and I already had a struggle to make the neoscrypt kernel fit to run there.
I haven’t done any profiling yet and it will be some more days before I have the time to do it.
But a question arises: you write that you are adding the kernel to a cpuminer. How so? Did you also add all the OpenCL management code to the cpuminer? Why take on that double burden? I know that cgminer is bad from a software design/architecture perspective, but adding everything anew to cpuminer seems like duplicated work.
Regards,
Andre
-
Hi,
I’m so glad to meet you here!!! :)
Actually, I played with the original cgminer for a while, but I found it contains too much code that I do not need.
So this time, I just added simple OpenCL initialization and execution code to cpuminer (originally the neoscrypt cpuminer) and got it working.
(I like the neoscrypt cpuminer code: simple, easy to understand and maintain.)
(Sorry about the neoscrypt code: I removed lots of code that is unnecessary, to my understanding, compared to the original.)
As a result, the code is very much customized for my R9 290 and has no hardware monitoring function.
For me, I want a very thin GPU miner for neoscrypt that is easy to debug and code.
As for neoscrypt being designed to resist ASICs, I think it will not work as well as people expect.
I write FPGA code for PTS (for fun :)) and know that the current memory operations are not a problem for an FPGA; as long as people are willing to invest in it, an ASIC can be made very soon.
Thanks
Regards.
Ralph
-
Does anybody have a link to the latest version, 3.7.6b, for Windows?
When I build it everything goes fine, with no errors, but I get no accepted shares.
Using 13.12 with AMD SDK 2.9, Win8.1, MinGW.
Thanks,
Pete
-
Quoting Ralph:
"So this time, I just added simple OpenCL initialization and execution code to cpuminer (originally the neoscrypt cpuminer) and got it working.
(I like the neoscrypt cpuminer code: simple, easy to understand and maintain.)
(Sorry about the neoscrypt code: I removed lots of code that is unnecessary, to my understanding, compared to the original.)"
If the unnecessary code you’re talking about is SHA-256 and BLAKE-256, these parts may be removed. SHA-256 is for backward compatibility with Scrypt. BLAKE-256 was for testing purposes until FastKDF and BLAKE2s were implemented. Good luck with your coding. CPUminer is also a work in progress.
-
That looks quite fine. Please retry mining with the intensity set significantly lower, i.e. to 8 or even less (-I 8).
-
I have the same problem: it compiles fine but I get no accepted shares.
I need to lower the intensity to 6 to get shares accepted, and then I get 2.5kh/s or so.
With 3.7.5 I get about 50 kh/s with -I 11.
-
Hi, is anybody else having the problem that 3.7.5 stops working after 3-4 hours? This is on Win7 x64.
-
Hi raintowers,
can you run the miner with -TDP and log the output to a file? Please send me roughly the last 1000 lines of the file.
-
Could you open an issue at https://github.com/vehre/neo-gpuminer/issues for every problem that occurs?
This helps to keep track of problems: when they occurred, whether they have been processed or resolved, and when.
Thanks for helping.
-
OK, I’ve done it. I’m a Network Admin but have been out of it for a long time. I have some tech skills but not much on the programming side. I will help as much as I can.
-
Sorry for being imprecise. I am interested in roughly the last 1000 lines of the log from when the miner aborts on its own after running for 3-5 hours.
Please add those lines to the tickets on github.
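For anyone unsure how to cut a long log down to its last 1000 lines, here is a minimal sketch. The file names `miner.log` and `miner_tail.log` are hypothetical; it assumes the miner output was already redirected into a plain text file.

```python
from collections import deque

def tail_lines(path: str, n: int = 1000) -> list[str]:
    """Return the last `n` lines of a text file without holding the whole file."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        # A deque with maxlen keeps only the most recent `n` lines.
        return list(deque(fh, maxlen=n))

def write_tail(src: str, dst: str, n: int = 1000) -> int:
    """Copy the last `n` lines of `src` into `dst`; return the number written."""
    lines = tail_lines(src, n)
    with open(dst, "w", encoding="utf-8") as fh:
        fh.writelines(lines)
    return len(lines)

if __name__ == "__main__":
    import os
    if os.path.exists("miner.log"):   # hypothetical log file name
        print(write_tail("miner.log", "miner_tail.log"), "lines written")
```

The resulting `miner_tail.log` is small enough to attach to a ticket.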
-
Well that’s interesting. I’ve been running since yesterday with the -TDP options and it hasn’t frozen once… Still going. I wonder what showing verbose output has to do with it? I’ll keep you updated.
-
Yes I also can’t run the 3.7.6x (only get HW errors) but 3.7.5 is fine.
I’m running Windows 7 64-bit with an AMD HD7800 series card; I can’t find the receipt with the exact model on it at the moment. What are you running?
-
No, still only HW errors for me on 3.7.6, unfortunately.
I will be offline for the next 6 days after tonight, so I won’t be able to provide any other feedback until most likely next weekend.
-
Deleted the bin file, but mine is called neoscrypt140909Pitcairngtc256w256l4
Also tried replacing the cl file with my version from when I compiled 3.7.5, but it made no difference
-
Got a set of 7950s running great at 75Kh/s. It seems that with these GPUs, the lower the -w value the better. GPU 3 is an R9 290, which I’ve only gotten to hash at around 78Kh/s so far (-w 24, --thread-concurrency 9856 to prevent failing on launch). Still trying to figure out some optimal settings for it. With all 4 GPUs running I’m hashing 303Kh/s @ 750 watts.
Here’s what I’m using:
cgminer.exe --neoscrypt -I 13 -w 32
Some screenshots…
http://i.imgur.com/vkDMpht.jpg
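The quoted totals can be turned into an efficiency figure with a few lines of arithmetic. The hash rates and wattage are from the post. The card count of three 7950s is an assumption, inferred from "all 4 GPUs" with one of them being the R9 290.

```python
total_khs = 303.0        # all four GPUs (from the post)
power_w = 750.0          # power draw (from the post)

hs_per_watt = total_khs * 1000.0 / power_w   # hashes per second per watt

# Cross-check of the total, ASSUMING three 7950s plus the one R9 290.
assumed_7950_count = 3
rate_7950_khs = 75.0     # per 7950 (from the post)
rate_290_khs = 78.0      # R9 290 (from the post)
expected_total_khs = assumed_7950_count * rate_7950_khs + rate_290_khs

print(f"{hs_per_watt:.0f} H/s per watt")
print(f"expected total under the assumption: {expected_total_khs:.0f} Kh/s")
```

Under that assumption the per-card rates add up to the reported 303Kh/s.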
-
I was always under the impression that the 280X had comparable performance with the regular 290 - which in the case of your results would be true. As far as power consumption, I’ve had great luck with MSI Afterburner undervolting my 290’s to save power. If you haven’t tried it yet, I highly recommend playing with it to see if you can find a good stable lower voltage. I run my Hynix ram 290’s with -62 for the voltage. It really depends on which type of memory you have in yours. I haven’t had very good luck with Elpida ram and undervolting.
-
Maybe try going back to an older CC version. My 2 rigs have 13.11 and 13.12 respectively; however, I am running Win 7 64-bit, not Windows 8.1. Uninstall 14.4, then run the DDU software in safe mode to make sure you’ve cleaned up all the old crap. Install one of the older CC versions, and for troubleshooting purposes you should also reset any OC tweaks you may have configured. You should also take a look at the .bin file that cgminer is creating and look at the “tc” setting. Sometimes having a thread-concurrency setting that is too high or too low can cause a ton of HW errors. That’s my best guess anyway.
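To read the “tc” value without squinting at the file name, a small sketch like the following can help. It assumes the .bin name encodes thread concurrency as "tc" followed by digits, as in the file name quoted earlier in the thread. The helper name is hypothetical and this is not part of cgminer.

```python
import re

def thread_concurrency_from_name(bin_name: str):
    """Return the number following 'tc' in a kernel .bin name, or None.

    ASSUMPTION: the name contains 'tc<digits>' for thread concurrency.
    """
    match = re.search(r"tc(\d+)", bin_name)
    return int(match.group(1)) if match else None

# File name as posted earlier in the thread.
name = "neoscrypt140909Pitcairngtc256w256l4"
print(thread_concurrency_from_name(name))  # prints 256
```

Note that "Pitcairn" itself contains the letters "tc", which is why the pattern requires digits directly after them.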
-
Bring the work back up to 24 now that you’re on 13.12 - I bet you’ll start hashing in the upper 70’s. On an entirely different note, my 290 rig seems to perform exactly the same with lower gpu clock and memory speeds. I’m using MSI Afterburner to underclock and undervolt everything. 947 clock, 1250 mem, and -100 core voltage. Dropped power consumption quite a bit while not affecting the hash rate at all.