NSGminer v0.9.2: The Fastest Feathercoin / NeoScrypt GPU Miner


  • Regular Member

    I still haven’t patched in more precise intensity, but I have managed to improve upon my 01/17/2016 record by around 2.657%: 425kh/s on a 7950 at 1050/1500 now. To compare with the 7990, I also ran a test at 1000/1500, which gave 410kh/s to 411kh/s. I’ll do some tests on power draw later.


  • Regular Member

    @Wolf0 I have optimised the most important XOR in FastKDF already. It was a bottleneck to do it bytewise on GCN. 120K kernel size isn’t very large because Salsa and ChaCha separately fit the code cache and FastKDF has more important issues like memory alignment. I’ll try to optimise it better.


  • Moderators

    I like the idea of optimising for “power efficiency”, not “speed”. 😉


  • Regular Member

    @ghostlander said:

    @Wolf0 I have optimised the most important XOR in FastKDF already. It was a bottleneck to do it bytewise on GCN. 120K kernel size isn’t very large because Salsa and ChaCha separately fit the code cache and FastKDF has more important issues like memory alignment. I’ll try to optimise it better.

    Which XOR would that be? I feel like I’m derping and missing something obvious, but I see the ending XOR with the if/else branch outside the loop, and the XOR inside the loop, which is done with a call to neoscrypt_bxor()… I just looked at your current git again, double-checked this, then read the neoscrypt_bxor() function again - it’s still bytewise. Unless you mean something you’ve not pushed, in which case never mind. If you have, then nice - my trick of aligning the XOR worked out for you.

    Anyways, you seem to be working from the outside in, rather than from the inside out, when it comes to the optimization of the code - the “outside” being the portions with less time spent, and the “inside” being the opposite. You really might want to look into SMix() - that’s where you really can gain hashrate.

    @wrapper said:

    I like the idea of optimising for “power efficiency”, not “speed”. 😉

    They are almost always one and the same in the GPU arena. If I have shitty, slow code, it leaves portions of the GPU unused, or at least under-utilized, which causes the lower power consumption people notice. However, if these resources are used well, then the hashrate goes up far more than power does. I actually have records from my really old X11 optimizations to show this, as well as exact percentages taken from runs of the (then) stock X11 shipping with SGMiner and mine on Freya.


  • Regular Member

    @Wolf0 https://github.com/ghostlander/nsgminer/blob/692e2ef2946229cf057dd006c8e85c8674f0342f/neoscrypt.cl#L713

    It’s executed 64 times per hash. The final XOR outside the loop is less important.

    @Wolf0 said:

    Unless you mean something you’ve not pushed, in which case never mind. If you have, then nice - my trick of aligning the XOR worked out for you.

    Well, I added it to my beta 10 days ago. You mentioned doing the bytewise XOR in uints; I have vectorised it, which is also fine. It’s not uploaded to GitHub yet, but quite a few people use it right now. It’s well improved over the previous release in performance and compatibility. I see only a 5% decrease when switching from 14.6 to 15.7 drivers. It was much worse before (https://bitcointalk.org/index.php?topic=712650.msg13585416#msg13585416).


  • Regular Member

    @ghostlander said:

    @Wolf0 https://github.com/ghostlander/nsgminer/blob/692e2ef2946229cf057dd006c8e85c8674f0342f/neoscrypt.cl#L713

    It’s executed 64 times per hash. The final XOR outside the loop is less important.

    @Wolf0 said:

    Unless you mean something you’ve not pushed, in which case never mind. If you have, then nice - my trick of aligning the XOR worked out for you.

    Well, I added it to my beta 10 days ago. You mentioned doing the bytewise XOR in uints; I have vectorised it, which is also fine. It’s not uploaded to GitHub yet, but quite a few people use it right now. It’s well improved over the previous release in performance and compatibility. I see only a 5% decrease when switching from 14.6 to 15.7 drivers. It was much worse before (https://bitcointalk.org/index.php?topic=712650.msg13585416#msg13585416).

    OH, lol, yes, that is good, but that was not what I meant! This line:

    [code]
    neoscrypt_bxor(&Bb[bufptr], &T[0], 32);
    [/code]

    I’m saying I did this operation using uints.
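
    For readers following along, the difference between a bytewise XOR and one done in uints can be sketched in plain host-side C. This is only an illustration under assumptions: the function names xor_bytes, xor_words and xor_variants_agree are made up here, and none of this is the NSGminer kernel source or anyone’s actual optimisation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytewise XOR: one byte per loop iteration (illustrative only). */
static void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] ^= src[i];
}

/* Word-wise XOR: four bytes per loop iteration. memcpy keeps unaligned
 * accesses well-defined in portable C; a GPU kernel would more likely
 * align the buffers and XOR uint (or vector) values directly. */
static void xor_words(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);
    for (; i + 4 <= len; i += 4) {
        uint32_t a, b;
        memcpy(&a, dst + i, 4);
        memcpy(&b, src + i, 4);
        a ^= b;
        memcpy(dst + i, &a, 4);
    }
    for (; i < len; i++)    /* leftover tail bytes, if any */
        dst[i] ^= src[i];
}

/* Returns 1 if both variants give identical results for a 32-byte XOR
 * that starts at an odd (unaligned) offset inside a larger buffer. */
static int xor_variants_agree(void)
{
    uint8_t a[64], b[64], t[32];

    for (int i = 0; i < 64; i++)
        a[i] = b[i] = (uint8_t)(i * 7 + 3);
    for (int i = 0; i < 32; i++)
        t[i] = (uint8_t)(255 - i);

    xor_bytes(&a[5], t, 32);
    xor_words(&b[5], t, 32);
    return memcmp(a, b, sizeof a) == 0;
}
```

    Both routines produce the same bytes; the word-wise one simply runs a quarter as many loop iterations for a 32-byte block.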


  • Regular Member

    @Wolf0 I get it. I’ve also rewritten it. The code quoted is plain bytewise, though old VLIW GPUs like it for some arcane reason.


  • Regular Member

    @ghostlander said:

    @Wolf0 I get it. I’ve also rewritten it. The code quoted is plain bytewise, though old VLIW GPUs like it for some arcane reason.

    Odd. I got my 6970 today, so I should be able to work on Cayman in a while.


  • Regular Member

    I have managed to squeeze 520 and 500KH/s out of my R9 290s, and up to 420KH/s on my 7950s, which is more like what I was expecting. All my 7970/280X cards are between 450 and 500KH/s, with the majority being around 500.


  • Regular Member

    I have optimised almost all bytewise parts of FastKDF. 800KH/s before with v7 beta, 820KH/s now (Catalyst 14.6) or 770KH/s (Catalyst 15.7).


  • Regular Member

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.


  • Regular Member

    @AmDD said:

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.

    What driver version are you using?


  • Regular Member

    @RIPPEDDRAGON said:

    @AmDD said:

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.

    What driver version are you using?

    14.7


  • Regular Member

    @AmDD said:

    @RIPPEDDRAGON said:

    @AmDD said:

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.

    What driver version are you using?

    14.7

    Weird… I think that is what I am running, plain and simple: -w 128 -I 16… I will check tonight.

    What clocks are yours set to? I’m running 1110/1550 or higher…


  • Regular Member

    @RIPPEDDRAGON said:

    @AmDD said:

    @RIPPEDDRAGON said:

    @AmDD said:

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.

    What driver version are you using?

    14.7

    Weird… I think that is what I am running, plain and simple: -w 128 -I 16… I will check tonight.

    What clocks are yours set to? I’m running 1110/1550 or higher…

    I’ll have to double-check my settings, but I think the clocks are 1100/1600 or so.


  • Regular Member

    Trying to compile on a custom Puppy Linux install (MinerPup).

    ...
    make[2]: Entering directory `/root/archive/nsgminer'
      CC       nsgminer-miner.o
    In file included from miner.c:66:
    neoscrypt.h:9: error: redefinition of typedef ‘ullong’
    miner.h:34: error: previous declaration of ‘ullong’ was here
    neoscrypt.h:12: error: redefinition of typedef ‘uchar’
    miner.h:30: error: previous declaration of ‘uchar’ was here
    make[2]: *** [nsgminer-miner.o] Error 1
    make[2]: Leaving directory `/root/archive/nsgminer'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/root/archive/nsgminer'
    make: *** [all] Error 2
    

    Any suggestions?

    - UnklAdM


  • Regular Member

    @RIPPEDDRAGON said:

    @AmDD said:

    @RIPPEDDRAGON said:

    @AmDD said:

    Even with beta v7, I’m floating around 320KH/s on my 7950s. I only saw a ~20-30KH/s increase.

    What driver version are you using?

    14.7

    Weird… I think that is what I am running, plain and simple: -w 128 -I 16… I will check tonight.

    What clocks are yours set to? I’m running 1110/1550 or higher…

    -w 256 -I 13 -g 2 and 1050/1600 clocks on 14.7 drivers. When I got home I saw that the rig had shut down and had issues booting back up. I reinstalled the drivers and tried -w 128. So far it’s slower, but I’ll let it hash a while to see what it does.


  • Regular Member

    @UnklAdM said:

    Trying to compile on a custom Puppy Linux install (MinerPup).

    ...
    make[2]: Entering directory `/root/archive/nsgminer'
      CC       nsgminer-miner.o
    In file included from miner.c:66:
    neoscrypt.h:9: error: redefinition of typedef ‘ullong’
    miner.h:34: error: previous declaration of ‘ullong’ was here
    neoscrypt.h:12: error: redefinition of typedef ‘uchar’
    miner.h:30: error: previous declaration of ‘uchar’ was here
    make[2]: *** [nsgminer-miner.o] Error 1
    make[2]: Leaving directory `/root/archive/nsgminer'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/root/archive/nsgminer'
    make: *** [all] Error 2
    

    Any suggestions?

    - UnklAdM

    Edit miner.c and driver-cpu.c to include neoscrypt.h before miner.h, and update typedefs in miner.h to the following:

    #if !(uchar)
    typedef unsigned char uchar;
    #endif
    #if !(uint)
    typedef unsigned int uint;
    #endif
    #if !(ullong)
    typedef unsigned long long ullong;
    #endif
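
    A note on the guards for anyone adapting this: the C preprocessor cannot see typedef names. In an #if expression, an identifier that is not a macro evaluates to 0, so a test like !(uchar) is true unless uchar has also been #defined somewhere. Compilers that follow C99 or earlier also reject a repeated typedef even when the type is identical (C11 permits it), which is consistent with the errors in the log. A minimal sketch of one alternative is to pair the typedefs with a guard macro. The name HAVE_SHORT_TYPEDEFS and the helper short_typedefs_ok are hypothetical, invented for this example, and are not part of NSGminer.

```c
#include <assert.h>
#include <limits.h>

/* --- what the first header to be included might contain --- */
#ifndef HAVE_SHORT_TYPEDEFS          /* hypothetical guard macro */
#define HAVE_SHORT_TYPEDEFS
typedef unsigned char uchar;
typedef unsigned long long ullong;
#endif

/* --- what the second header might contain: the same guarded block.
 * Because the macro is already defined, the preprocessor skips it and
 * the typedefs are never repeated, whatever the include order. --- */
#ifndef HAVE_SHORT_TYPEDEFS
#define HAVE_SHORT_TYPEDEFS
typedef unsigned char uchar;
typedef unsigned long long ullong;
#endif

/* Sanity check: returns 1 if the short type names are usable and have
 * the widths the C standard guarantees for these types. */
static int short_typedefs_ok(void)
{
    uchar c = 0xFF;
    ullong big = ~0ULL;

    assert(sizeof(uchar) == 1);
    return c == 0xFF && sizeof(big) * CHAR_BIT >= 64;
}
```

    With a guard like this in both headers, the order in which miner.c includes them should no longer matter.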


  • Moderators

    @UnklAdM

    The code defines a type that was already defined in another part/module/file.

    Suggestion:

    Edit miner.h and comment out the lines defining the types ullong and uchar.

    Then try again.

    [Edit]
    Ghostlander’s way is far more elegant 😃


  • Regular Member

    @Wellenreiter said:

    @UnklAdM

    The code defines a type that was already defined in another part/module/file.

    Suggestion:

    Edit miner.h and comment out the lines defining the types ullong and uchar.

    Then try again.

    [Edit]
    Ghostlander’s way is far more elegant 😃

    Tried that, that’s why I’m here. Thanks anyway! I’ll try the other fix when I get to the office.

    - UnklAdM

 
