[Dev] NeoScrypt GPU Miner - Public Beta Test
-
I was always under the impression that the 280X had comparable performance with the regular 290 - which in the case of your results would be true. As far as power consumption, I’ve had great luck with MSI Afterburner undervolting my 290’s to save power. If you haven’t tried it yet, I highly recommend playing with it to see if you can find a good stable lower voltage. I run my Hynix ram 290’s with -62 for the voltage. It really depends on which type of memory you have in yours. I haven’t had very good luck with Elpida ram and undervolting.
-
Maybe try going back to an older CC version. My 2 rigs have 13.11 and 13.12 respectively - however, I am running Win 7 64-bit, not Windows 8.1. Uninstall 14.4, then run the DDU software in safe mode to make sure you’ve cleaned up all the old crap. Install one of the older CC versions, and for troubleshooting purposes you should also reset any OC tweaks you may have configured. You should also take a look at the .bin file that cgminer is creating and look at the “tc” setting. Sometimes having a thread-concurrency setting that is too high or too low can cause a ton of HW errors. That’s my best guess anyway.
-
Bring the work back up to 24 now that you’re on 13.12 - I bet you’ll start hashing in the upper 70’s. On an entirely different note, my 290 rig seems to perform exactly the same with lower gpu clock and memory speeds. I’m using MSI Afterburner to underclock and undervolt everything. 947 clock, 1250 mem, and -100 core voltage. Dropped power consumption quite a bit while not affecting the hash rate at all.
-
my amd driver crashes when using -w 24 -I 12…
radeon 7990, driver 13.something, although getting nearly 140 khs
-
Currently getting about 80 kh/s with 3.7.6b, cl file from 3.7.5, *.bin generated with drivers 13.12. And currently running on 14.9
Sapphire 280x with settings
cgminer.exe --neoscrypt -I 13 -w 32 --thread-concurrency 8192
definitely something wrong with the *.cl file from 3.7.6b, no accepted shares unless i lower intensity to 6 but then no performance.
link to my files working on 280x
https://mega.co.nz/#!YFli0LQR!7RjxmzHq1b0Vo0Afr107gNq51rIx1GXSxQOTYfw9_Z8
-
nope… the miner freezes here after a while :(
it starts off well and after five minutes this is what i get
or
what should i do?
-
nope… the miner freezes here after a while :(
it starts off well and after five minutes this is what i get
or
what should i do?
I’ve been having the same problems with freezing. I use the -T option and it keeps running with -w 256 -I 11 --gpu-threads 2. We’ve been discussing options here https://github.com/vehre/neo-gpuminer/issues/3
-
thanks! Looks good for now… Lets see for how long it goes :)
148 khs, WU 10…
EDIT: been working stable for a couple of hours now! I am not entirely sure but I think the p2pool…de is paying extra pxc for finding a block :)
-
-T -w 48 -I 13 works best for me so far… i wish I knew that -T option months ago :D
-
as i see this is only change between them???:
- (get_global_id(0)% CONCURRENT_THREADS)];
- (get_global_id(0)% WORKGROUPSIZE)];
Correct, this is the only direct change in the kernel. Maybe I have missed something in the miner that correlates with this.
Advice: when you use a config file for running the miner, please remove the “thread-concurrency” : N entry from the config file if N is smaller than the worksize!
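One way to picture why the modulus matters (a sketch only; the buffer layout here is an assumption and not taken from the miner source): if a scratch buffer has a fixed number of slots and every thread picks its slot as get_global_id(0) % modulus, then any modulus larger than the slot count lets some threads index past the end. The helper count_out_of_range is hypothetical and written in plain host C rather than OpenCL, so it can be run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cgminer or the kernel).
 * Counts how many of the first `threads` global IDs would index past a
 * buffer of `slots` entries when the slot is computed as gid % modulus.
 * Assumption: the buffer is sized independently of the modulus. */
size_t count_out_of_range(size_t threads, size_t modulus, size_t slots)
{
    assert(modulus > 0); /* a zero modulus would divide by zero */

    size_t bad = 0;
    for (size_t gid = 0; gid < threads; ++gid) {
        if (gid % modulus >= slots)
            ++bad;
    }
    return bad;
}
```

With 512 threads, a modulus of 256 and only 128 slots, half of the threads land outside the buffer; with 256 or more slots none do. That would be consistent with the advice above, but whether the kernel really behaves this way is a guess.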
-
(opt_neoscrypt|| opt_scrypt)? 84: sizeof(work->data), true))) {
have no idea how it compares to setting of my miner but i get most stable and fast hash with value -w 84
That line of code certainly has nothing to do with the worksize. The line you have copied there takes care of correct communication when solo mining.
-
So just to wrap it up:
-w should be the preferred worksize of the GPU. This is usually a value evenly divisible by 64, and I haven’t yet found a GPU where this value is below 256. That is why cgminer chooses 256 as the default for worksize.
The HW errors occurring with 3.7.6x are most likely due to me failing to ensure that the thread-concurrency had a value equal to or greater than the worksize. I am currently working on a version where I get rid of the thread-concurrency completely and make use of the worksize only. Thread concurrency is of no use in neoscrypt and makes sense in scrypt only (at least to me).
My current setting on an Nvidia Geforce 218 is:
-w 512
The thread-concurrency is implicitly set to 512 currently. My intensity is set to default (dynamic).
I don’t use configuration files, only command-line settings. When you use configuration files, make a backup and set worksize back to 256 or 512 depending on what your GPU prefers (removing the line completely makes cgminer select the device’s preferred value). Next, make sure thread-concurrency is set to the worksize or a greater value (again, removing the line makes cgminer use a reasonable default value). Setting thread-concurrency to a value significantly greater than the worksize only wastes memory.
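The two rules from the wrap-up can be written down as a small pre-flight check. This is a minimal sketch in plain C, assuming you already have the two numbers from your command line or config file; check_settings and its return codes are made up for illustration and are not part of cgminer.

```c
#include <assert.h>

/* Hypothetical return codes, for illustration only. */
enum { SETTINGS_OK = 0, WORKSIZE_UNUSUAL = 1, TC_TOO_SMALL = 2 };

/* Checks the rules of thumb from the post above:
 *  - thread-concurrency must be equal to or greater than the worksize
 *    (reported first, since it is the likely cause of HW errors),
 *  - the worksize is usually a non-zero multiple of 64 (a hint only). */
int check_settings(unsigned worksize, unsigned thread_concurrency)
{
    if (thread_concurrency < worksize)
        return TC_TOO_SMALL;
    if (worksize == 0 || worksize % 64 != 0)
        return WORKSIZE_UNUSUAL;
    return SETTINGS_OK;
}
```

For example, check_settings(512, 512) passes, check_settings(256, 128) reports the thread-concurrency problem, and check_settings(84, 8192) only flags the unusual worksize. The multiple-of-64 test is a heuristic ("usually" in the post), so treat that result as a warning rather than an error.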
-
Considering cgminer is dead for GPU mining on newer versions, will this be added to sgminer instead in the future? I think having a common ground like that would help everyone instead of many different forks.
-
Considering cgminer is dead for GPU mining on newer versions, will this be added to sgminer instead in the future? I think having a common ground like that would help everyone instead of many different forks.
Would also love to see NeoScrypt ported to sgminer. Sgminer is much less trouble to setup, less failure.
-
Hello Everyone,
I’m back. O0
After 4 days and nights, I finally got my neoscrypt code optimized successfully by the lovely opencl compiler.
The current result is: ScrachReg reduced to 224 and the overall hash rate for an R9 290 is 160-170K/s. :)
With 5 R9 290, I got around 800-830k/s locally and 780-800k/s on PXC.theblocksfactory.
link: http://i58.tinypic.com/noxd9h.jpg
My rig:
5 ASUS R9 290
Win8.1
GPU: default core and memory frequency.
AppSDK: 2.9.1
Catalyst Driver: 14.4
Plus: Coding opencl is really nightmare: Comment one line or add one useless line will cause the result 100% different.
Sorry for my national holiday, but the result is exciting.
Love Neoscrypt, hate opencl code but enjoying the fun.
Ralph
-
I cannot express how grateful I am for everyone’s work here!
This is monumental. The newsletter will go out ASAP, the thread is primed…
BRING ON THE STRESS TESTING!!! sorry for the caps but this had to be yelled.
-
Looks like I’ll need to do a build on my linux boxes
-
Would also love to see NeoScrypt ported to sgminer. Sgminer is much less trouble to setup, less failure.
The issue is not porting the cl-kernel to sgminer, but the changes necessary to cope with the changes in the network protocol.
-
Plus: Coding opencl is really nightmare: Comment one line or add one useless line will cause the result 100% different.
Sorry for my national holiday, but the result is exciting.
I totally agree, coding OpenCL is a nightmare. Unfortunately, I can explain the theory of why this happens: the SIMD model is playing against us. Adding or deleting instructions that threads have to skip, or that are used to sync execution, weighs heavily on overall performance.
Let’s look at the code at the end of fastkdf():
if (a >= output_len)
// copy
else
// merge
Now “a” depends on the input data. When a bunch of threads executes this conditional on different data (remember, every thread has its own distinct data), chances are that some threads take the then part while others take the merge part. SIMD dictates that all threads execute the same instruction or skip it. In other words: all threads step through both parts of the conditional. OpenCL is able to switch off some of the threads, i.e., a thread sees the instruction but does not execute it. It idles. The compiler tries to handle this, but is not always successful.
So if just one thread needs to execute a different part of the conditional than all the other threads, all threads will nevertheless step through all instructions in both the then and the else branch.
So far just for the background. :)
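To put rough numbers on the divergence argument, here is a minimal host-side sketch in plain C. It models one group of threads running in lockstep over the copy/merge conditional: a branch costs its full instruction count as soon as a single thread needs it. The function lockstep_cost and the cost figures are hypothetical; real hardware and compilers (predication, select instructions) can behave differently, so take this as a simplified model only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified lockstep model of
 *     if (a >= output_len) copy; else merge;
 * for one group of n threads, where a[i] is thread i's value of "a".
 * copy_cost / merge_cost are made-up instruction counts. A branch is
 * paid for by the whole group if at least one thread takes it. */
unsigned lockstep_cost(const unsigned *a, size_t n, unsigned output_len,
                       unsigned copy_cost, unsigned merge_cost)
{
    int any_copy = 0;
    int any_merge = 0;

    for (size_t i = 0; i < n; ++i) {
        if (a[i] >= output_len)
            any_copy = 1;   /* this thread takes the "then" (copy) part */
        else
            any_merge = 1;  /* this thread takes the "else" (merge) part */
    }
    return (any_copy ? copy_cost : 0u) + (any_merge ? merge_cost : 0u);
}
```

With output_len 32, copy_cost 10 and merge_cost 25, a group whose values are {40, 50, 60, 70} pays 10, while {40, 50, 60, 10} pays 35: the single thread that needs the merge part makes the whole group step through both branches.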
-
I have tested 3.7.7b; it works normally.