Computation time in milliseconds
--------------------------------

Accumulation +*, operand sizes ~25,000 and ~8,000 bits      v2.1c
                                              softmul      hardmul
CPU                   MHz asm32 asm64   NTL   C32   C64 |  C32   C64
--------------------------------------------------------------------
ARM S3C2440A          400                     130       |
ARM920T PXA312        624                      80       |
Atom N270            1600     2           6    22       |    8
Pentium III/933       933     3          12    30       |    9
Pentium 4C/2400      2400     3           4    16       |    5
Athlon 900            900     2           8    28       |    7
Athlon XP 2500+      1826     1           4    14       |
Athlon 64 X2 3600+   1900     1     0     4    11     8 |    3     1
Athlon 64 X2 3800+   2000     1     0     3    10     8 |    3     1
Athlon 64 X2 4600+   2400     0           3     9       |
Phenom II X3 710     2600     0     0     2     8     6 |
Phenom II X6 1055T   2800     0     0     2     7     5 |    1     0
Phenom II X6 1090T   3200     0     0     2     6     5 |    2     0
FX-8150              3600     1     0     1     8     5 |    2     1
FX-8320              3500     0     0     1     8     5 |    2     0
Core Duo T2500       2000     2           4    13       |
Core 2 Duo E6420     2130     1           4    11       |
Core 2 Quad Q8200    2330     1     0     4    10     6 |    3     1
Core i7-950          3200     1     0     1     6     4 |    2     0
Core i7-6800K        3600     0     0     1     5     3 |    1     0
Xeon E3-1230         3200     1     0     1     7     6 |
Xeon E3-1240v3       3400     0     0     1     5     3 |    1     0
Athlon 200GE         3200     0     0     1     6     4 |    1     0
Ryzen 5 2600         3400     0     0     1     6     4 |    1     0
Accuracy of measurement ~1 ms


Raising to power 55907856907890, modulus size ~25,000 bits  v2.1c              v2.2
Test: Arif2 (powmod)                          softmul      hardmul         hardmul + div
CPU                   MHz asm32 asm64   NTL   C32   C64 |  C32   C64 |  C32   C64   a32   a64
---------------------------------------------------------------------------------------------
ARM S3C2440A          400                   62000       |            |
ARM920T PXA312        624                   40000       |            |
Atom N270            1600  2235        1859  5969       | 5516       | 1590         533
Pentium III/933       933  2906        3494  8578       | 6750       |
Pentium 4C/2400      2400  2343         906  4297       | 3719       |  941         734
Athlon 900            900  1892        2123  8051       | 5477       |
Athlon XP 2500+      1826   906        1047  4094       |            |
Athlon 64 X2 3600+   1900   890   562  1031  3406  1888 | 2512  1373 |
Athlon 64 X2 3800+   2000   813   500   969  3172  1797 | 2344  1297 |
Athlon 64 X2 4600+   2400   672         828  2640       |            |
Phenom II X3 710     2600   603   344   834  2380  1188 |            |
Phenom II X6 1055T   2800   563   313   703  2203  1094 | 1610   750 |  409   108   108    29
Phenom II X6 1090T   3200   484   281   625  1938   953 | 1406   656 |  358    95    95    25
FX-8150              3600   477   290   463  2289  1100 | 1575   767 |  420   106    95    30
FX-8320              3500   453   266   287  2350  1047 | 1578   719 |  419    94    91    25
Core Duo T2500       2000  1437         844  3765       |            |
Core 2 Duo E6420     2130  1234         735  2937       |            |
Core 2 Quad Q8200    2330  1125   578   656  2688  1329 | 1891   922 |  515   142   358   109
Core i7-950          3200   848   443   307  1686   844 | 1210   581 |  346    95   293    84
Core i7-6800K        3600   285   177   289  1342   724 | 1026   499 |
Xeon E3-1230         3200   561   374   453  1747   873 |            |
Xeon E3-1240v3       3400   297   172   286  1281   703 | 1000   485 |  281    72   104    29
Athlon 200GE         3200   360   188   250  1516   829 | 1188   563 |                    /22
Ryzen 5 2600         3400   343   172   187  1422   797 | 1109   547 |  310    80    85    23
Accuracy of measurement ~10 ms                                        ~1 ms               /12

The Arifexp64x program improves the Arif2 result in column a64
(shown after the slash):
  ~33% faster for Intel Haswell+ (Core i7-6800K and Xeon E3-1240v3)
  ~85% faster for AMD Zen (Athlon 200GE) and Zen+ (Ryzen 5 2600)


Multiplication, operand sizes ~250,000 and ~65,000 bits
Test: Arif3 (*)                               softmul      hardmul
CPU                   MHz asm32 asm64   NTL   C32   C64 |  C32   C64
--------------------------------------------------------------------
ARM S3C2440A          400                    2200       |
ARM920T PXA312        624                    1400       |
Atom N270            1600    33         109   366       |  130
Pentium III/933       933    55         214   494       |  161
Pentium 4C/2400      2400    45          64   245       |   77
Athlon 900            900    29         135   469       |  125
Athlon XP 2500+      1826    14          66   230       |
Athlon 64 X2 3600+   1900    12     3    61   189   133 |   56    14
Athlon 64 X2 3800+   2000    11     3    56   175   127 |   53    14
Athlon 64 X2 4600+   2400     9          48   147       |
Phenom II X3 710     2600     8     2    44   133    95 |
Phenom II X6 1055T   2800     7     2    42   122    89 |   38     9
Phenom II X6 1090T   3200     6     2    36   106    78 |   33     8
FX-8150              3600     7     2    27   137    89 |   37     9
FX-8320              3500     6     2    23   131    81 |   31     8
Core Duo T2500       2000    28          70   217       |
Core 2 Duo E6420     2130    23          70   175       |
Core 2 Quad Q8200    2330    21     8    64   161   105 |   42    11
Core i7-950          3200    17     6    21   104    67 |   27     7
Core i7-6800K        3600     6   2/1    15    87    57 |   19     5
Xeon E3-1230         3200    10     4    23   103    81 |
Xeon E3-1240v3       3400     6   2/1    16    88    55 |   19     5
Athlon 200GE         3200     5   1/1    19   105    67 |   24     6
Ryzen 5 2600         3400     5   1/1    13    98    63 |   22     5
Accuracy of measurement ~1 ms

The Arifexp64x program improves the Arif3 result in column asm64
(shown after the slash):
  ~25% faster for Intel Haswell+ (Core i7-6800K and Xeon E3-1240v3)
  ~40% faster for AMD Zen (Athlon 200GE) and Zen+ (Ryzen 5 2600)


Division, operand sizes ~250,000 and ~65,000 bits                 v2.2
Test: Arif4 (/)                                               hardmul + div
CPU                   MHz asm32 asm64   NTL   C32   C64 |  C32   C64   a32   a64
--------------------------------------------------------------------------------
ARM S3C2440A          400                   17000       |
ARM920T PXA312        624                    9000       |
Atom N270            1600   570         561  1242       |  371         118
Pentium III/933       933   970        1056  2005       |
Pentium 4C/2400      2400   611         278   981       |  226         163
Athlon 900            900   543         705  1729       |
Athlon XP 2500+      1826   263         345   900       |
Athlon 64 X2 3600+   1900   264   153   305   758   374 |
Athlon 64 X2 3800+   2000   236   141   298   697   358 |
Athlon 64 X2 4600+   2400   195         256   581       |
Phenom II X3 710     2600   161    91   233   479   211 |
Phenom II X6 1055T   2800   149    84   216   442   197 |   89    19    24     5
Phenom II X6 1090T   3200   128    74   189   386   170 |   78    16    21     4
FX-8150              3600   124    79   142   479   206 |   90    22    22     6
FX-8320              3500   122    75    83   478   203 |   90    20    21     6
Core Duo T2500       2000   359         241   786       |
Core 2 Duo E6420     2130   313         205   606       |
Core 2 Quad Q8200    2330   286   149   186   553   242 |  114    27    79    20
Core i7-950          3200   218   121    89   347   155 |   76    17    64    16
Core i7-6800K        3600    70    48    88   262   126 |
Xeon E3-1230         3200   111    78    97   323   150 |
Xeon E3-1240v3       3400    72    45    88   252   128 |   63    14    23   6/4
Athlon 200GE         3200    91    47    55   296   152 |
Ryzen 5 2600         3400    86    44    52   277   142 |   69    15    19   4/2
Accuracy of measurement ~10 ms                           ~1 ms

The Arifexp64x program improves the Arif4 result in column a64
(shown after the slash):
  ~25% faster for Intel Haswell+ (Core i7-6800K and Xeon E3-1240v3)
  ~2x faster for AMD Zen (Athlon 200GE) and Zen+ (Ryzen 5 2600)


Square root, operand size ~330,000 bits
Test: Arif5 (sqrt)
CPU                   MHz asm32 asm64   NTL   C32   C64
-------------------------------------------------------
ARM S3C2440A          400                   32000
ARM920T PXA312        624                   12000
Atom N270            1600  2922       23562  3125
Pentium III/933       933  2360       44453  4453
Pentium 4C/2400      2400  1531       11312  1891
Athlon 900            900  1312       27380  4376
Athlon XP 2500+      1826   656       13391  2141
Athlon 64 X2 3600+   1900   484   297 13088  1594   780
Athlon 64 X2 3800+   2000   453   234 12046  1469   750
Athlon 64 X2 4600+   2400   391       10031  1234
Phenom II X3 710     2600   364   203 10278  1140   563
Phenom II X6 1055T   2800   344   172  9031  1063   531
Phenom II X6 1090T   3200   297   156  7875   921   453
FX-8150              3600   396   321  5898  1094   512
FX-8320              3500   344   219  3474  1002   515
Core Duo T2500       2000  1062        9826  1844
Core 2 Duo E6420     2130  1141        8548  1422
Core 2 Quad Q8200    2330  1047   469  7767  1297   594
Core i7-950          3200   745   390  3658   960   408
Core i7-6800K        3600   331   162  3610   624   295
Xeon E3-1230         3200   530   328  3978   749   406
Xeon E3-1240v3       3400   328   156  3656   562   281
Athlon 200GE         3200   266   131  2283   656   344
Ryzen 5 2600         3400   250   125  2156   609   313
Accuracy of measurement ~10 ms


Legend:

asm32   - 32-bit code for x86 masm, cBigNumber 1.2b
asm64   - 64-bit code for x64 masm, cBigNumber 2.0
NTL     - 32+FPU portable C++ code, NTL 5.4 (http://www.shoup.net/ntl)
softmul - C++ code with software binary multiplication:
          C32 - 32-bit portable C++ code, cBigNumber 1.2b
                or cBigNumber 2.0 for Haswell+/Zen
          C64 - 64-bit portable C++ code, cBigNumber 2.0
hardmul - C++ code with hardware-based multiplication:
          C32 - 32-bit portable C++ code, cBigNumber 2.1c
          C64 - 64-bit portable C++ code, cBigNumber 2.1c
hardmul + div - code with hardware-based multiplication and division:
          a32 - 32-bit code for x86 masm, cBigNumber 2.2
          a64 - 64-bit code for x64 masm, cBigNumber 2.2
          C32 - 32-bit portable C++ code, cBigNumber 2.2
          C64 - 64-bit portable C++ code, cBigNumber 2.2

The test code is single-threaded, so it does not benefit from multiple
cores.

32-bit code of cBigNumber 1.2b is compiled under Visual C++ 6.0
in Release mode using "Maximize Speed" optimization.
32-bit code of cBigNumber 2.0/2.1c is compiled under Visual C++ 2010
Express in Release mode.
64-bit code of cBigNumber 2.0/2.1c is compiled under Visual C++ 2010
Express with SDK 7.1 x64 in Release mode; code for Haswell+/Zen is
compiled under Visual C++ 2015 Community.
Code of cBigNumber 2.2 is compiled under Visual C++ 2015 Community.
The NTL library is compiled with the macro NTL_STD_CXX disabled.
Code for ARM is compiled under Pocket GCC 3.3.3 -O5 -DNDEBUG.
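
To make the softmul and hardmul labels in the legend more concrete, here is a minimal sketch on plain 64-bit machine words instead of cBigNumber objects. All function names are hypothetical and the code is not taken from cBigNumber; it only illustrates, under the stated assumptions, the general idea of software binary (shift-and-add) multiplication, a single hardware multiplication, and the binary exponentiation that a powmod test such as Arif2 exercises.

```cpp
#include <cassert>
#include <cstdint>

// "Software binary" style: (a * b) mod m using only shifts, additions and
// comparisons. Assumption: 0 < m < 2^63, so x + y cannot overflow for x, y < m.
std::uint64_t mulmod_shift_add(std::uint64_t a, std::uint64_t b, std::uint64_t m) {
    assert(m != 0 && m < (std::uint64_t{1} << 63));
    std::uint64_t result = 0;
    a %= m;
    while (b != 0) {
        if (b & 1) {                 // add the current multiple of a for each set bit of b
            result += a;
            if (result >= m) result -= m;
        }
        a += a;                      // double a modulo m
        if (a >= m) a -= m;
        b >>= 1;
    }
    return result;
}

// "Hardware" style: one machine multiplication. The operands are 32-bit,
// so the product always fits in 64 bits.
std::uint64_t mulmod_hardware(std::uint32_t a, std::uint32_t b, std::uint32_t m) {
    assert(m != 0);
    return (static_cast<std::uint64_t>(a) * b) % m;
}

// Binary (square-and-multiply) exponentiation built on the shift-and-add product.
std::uint64_t powmod_binary(std::uint64_t base, std::uint64_t exponent, std::uint64_t m) {
    assert(m != 0 && m < (std::uint64_t{1} << 63));
    std::uint64_t result = 1 % m;    // gives 0 when m == 1
    base %= m;
    while (exponent != 0) {
        if (exponent & 1) result = mulmod_shift_add(result, base, m);
        base = mulmod_shift_add(base, base, m);
        exponent >>= 1;
    }
    return result;
}
```

For example, powmod_binary(2, 10, 1000) returns 24 (1024 mod 1000). The shift-and-add product needs one loop iteration per bit of the multiplier, while the hardware variant is a single multiply instruction; a bignum library faces the same trade-off for every pair of machine words.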

NOTE: Assembler source code is not included in the public distribution
of the class.

Binary code for performance testing is here:
http://www.imach.uran.ru/cbignum/test/Arifrun22.exe
For computers with Intel Haswell+/AMD Excavator/AMD Zen:
http://www.imach.uran.ru/cbignum/test/Arifrun22x.exe
(run it under Windows and send me the file Arifrun2.cab from the desktop).

When testing on Intel Xeon, AMD Phenom, FX and Ryzen processors, it is
recommended to disable Intel Turbo Boost, AMD Turbo Core and Core
Performance Boost, so that the CPU frequency is not raised in
single-threaded mode. The power-saving mode of the operating system
must be set to Max Performance.
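
The recommendations above aim at repeatable timings. When reproducing such measurements with your own code, one common way to reduce the remaining run-to-run noise is to repeat the workload and keep the best wall-clock time. The sketch below is only an assumption about how this could be done in standard C++; the function name best_time_ms is hypothetical, and this is not a description of how the Arifrun programs measure time.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Runs `work` the given number of times and returns the smallest elapsed
// wall-clock time in whole milliseconds, or -1 if repetitions <= 0.
template <typename Work>
std::int64_t best_time_ms(Work work, int repetitions) {
    using clock = std::chrono::steady_clock;   // monotonic, unaffected by clock changes
    if (repetitions <= 0) return -1;
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < repetitions; ++i) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const std::int64_t ms =
            std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        if (ms < best) best = ms;
    }
    return best;
}
```

Taking the minimum rather than the mean discards runs disturbed by other processes. Results are truncated to whole milliseconds, so workloads that finish in under a millisecond report 0, as many entries in the first table do.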