Sunday, December 17, 2023

[eoaqgkaq] proportion of RSA numbers

define an RSA number to be a number that factors into exactly two prime numbers, the larger being less than twice the smaller.

isrsa(n)=my(f=factorint(n)); my(s=matsize(f)); if(2!=s[1], return(0), if(f[1,2]!=1 || f[2,2]!=1, return(0), if(f[1,1]*2<f[2,1], return(0)))); 1;

here are the first few RSA numbers:

? for(i=0, 256, if(isrsa(i), print(i," ",factorint(i))))
6 [2, 1; 3, 1]
15 [3, 1; 5, 1]
35 [5, 1; 7, 1]
77 [7, 1; 11, 1]
91 [7, 1; 13, 1]
143 [11, 1; 13, 1]
187 [11, 1; 17, 1]
209 [11, 1; 19, 1]
221 [13, 1; 17, 1]
247 [13, 1; 19, 1]

for a given range of numbers, what proportion are RSA numbers?  this seems like this could be estimated analytically (perhaps with the Dickman function, for related work, see Tao), but we do Monte Carlo in Pari/GP, using parfor for parallel computation.

in order to speed up computation by a factor of 2, we assume all even numbers are not RSA numbers.  however, when testing small numbers, 6 needs to be special cased.  even though printing is done in the sequential block of parfor, the results do not come out in order.

the first few results below can be verified with the above list of RSA numbers.  among the 2-bit numbers with most significant bit set, namely 2 and 3, none are RSA numbers.  among the 5-bit numbers with MSB set, 16-31, none are RSA numbers.  among the 32 6-bit numbers with MSB set, 32-63, 1 (35) is an RSA number: 1e6/32 = 31250.  among the 8 4-bit numbers with MSB set, 8-15, 1 (15) is an RSA number: 1e6/8 = 125000.  among the 128 8-bit numbers with MSB set, 128-255, 5 (143, 187, 209, 221, 247) are RSA numbers: 1e6/128*5 = 39062.5.  among the 1-bit number with MSB set, 1, none are RSA numbers.  among the 64 7-bit numbers with MSB set, 64-127, 2 (35, 77) are RSA numbers: 1e6/64*2 = 31250.  among the 4 3-bit numbers with MSB set, 4-7, 1 (6) is an RSA number: 1e6/4 = 250000.

first batch (actually done second), up to 55 bits:

? gettime; export(isrsa); my(ncount=1000000); parfor(bitsize=1, 55, my(s=0); for(i=1, ncount, my(n=random(2^(bitsize-1))+2^(bitsize-1)); if((n!=6)&&(n%2==0), next); if(isrsa(n), s+=1)); s, x, print("dataline "bitsize" "x" "ncount" "gettime))
dataline 2 0 1000000 18573
dataline 5 0 1000000 1904
dataline 6 31267 1000000 721
dataline 4 124853 1000000 161
dataline 8 39461 1000000 429
dataline 1 0 1000000 657
dataline 7 31457 1000000 619
dataline 3 249848 1000000 3514
dataline 10 19328 1000000 15600
dataline 9 19521 1000000 358
dataline 13 15826 1000000 771
dataline 11 18540 1000000 435
dataline 12 17816 1000000 656
dataline 15 11376 1000000 1572
dataline 14 12527 1000000 70
dataline 16 10891 1000000 2589
dataline 17 9477 1000000 15102
dataline 20 7257 1000000 2855
dataline 18 8500 1000000 1187
dataline 19 7705 1000000 1742
dataline 21 6747 1000000 2350
dataline 22 5958 1000000 1498
dataline 23 5487 1000000 2263
dataline 24 5089 1000000 2358
dataline 25 4625 1000000 15038
dataline 26 4297 1000000 5640
dataline 27 3963 1000000 2492
dataline 28 3790 1000000 3167
dataline 29 3613 1000000 3506
dataline 30 3317 1000000 6984
dataline 31 3043 1000000 6711
dataline 32 2769 1000000 8230
dataline 33 2616 1000000 23971
dataline 34 2583 1000000 18296
dataline 35 2467 1000000 10707
dataline 36 2216 1000000 8008
dataline 37 2173 1000000 17621
dataline 38 2078 1000000 16277
dataline 39 1908 1000000 25297
dataline 40 1863 1000000 22039
dataline 41 1790 1000000 39006
dataline 42 1694 1000000 42367
dataline 43 1593 1000000 38054
dataline 44 1524 1000000 35728
dataline 45 1499 1000000 49830
dataline 46 1410 1000000 48127
dataline 47 1351 1000000 62108
dataline 48 1282 1000000 65871
dataline 49 1188 1000000 76204
dataline 50 1231 1000000 71772
dataline 51 1122 1000000 66634
dataline 52 1074 1000000 44500
dataline 53 1029 1000000 56539
dataline 54 1007 1000000 37354
dataline 55 954 1000000 22767
cpu time = 17min, 8,807 ms, real time = 5min, 50,115 ms.

for example, the last line above should be read: we randomly sampled 1000000 55-bit numbers with MSB set.  954 were RSA numbers = 0.000954 = 0.0954% .  22767 milliseconds of CPU time had been expended since the previous line was printed (though gettime is mostly meaningless in this parallel computation).

second batch:

? gettime; export(isrsa); my(ncount=1000000); parfor(bitsize=56, +oo, my(s=0); for(i=1, ncount, my(n=random(2^(bitsize-1))+2^(bitsize-1)); if(n%2==0, next); if(isrsa(n), s+=1)); s, x, print("dataline "bitsize" "x" "ncount" "gettime))
dataline 56 928 1000000 967649
dataline 57 913 1000000 139615
dataline 58 887 1000000 154693
dataline 60 841 1000000 74465
dataline 59 837 1000000 107296
dataline 61 850 1000000 11812
dataline 62 763 1000000 70641
dataline 63 726 1000000 78389
dataline 64 722 1000000 1068514
dataline 65 690 1000000 446623
dataline 66 638 1000000 266677
dataline 67 694 1000000 211371
dataline 68 641 1000000 253129
dataline 69 619 1000000 164697
dataline 70 557 1000000 221582
dataline 71 577 1000000 253060
dataline 72 583 1000000 1208599
dataline 73 511 1000000 584578
dataline 74 502 1000000 448724
dataline 75 498 1000000 380748
dataline 76 499 1000000 391445
dataline 77 470 1000000 382709
dataline 78 471 1000000 403028
dataline 79 483 1000000 458870
dataline 80 495 1000000 1413907
dataline 81 431 1000000 791296
dataline 82 436 1000000 694658
dataline 83 411 1000000 649589
dataline 84 412 1000000 550870
dataline 85 399 1000000 621967
dataline 86 407 1000000 650454
dataline 87 399 1000000 717952
dataline 88 392 1000000 1735925
dataline 89 342 1000000 1041960
dataline 90 352 1000000 996877
dataline 91 346 1000000 1002713
dataline 92 341 1000000 860047
dataline 93 304 1000000 955470
dataline 94 333 1000000 1012760
dataline 95 329 1000000 1054538
dataline 96 299 1000000 2080883
dataline 97 299 1000000 1574054
dataline 98 283 1000000 1497025
dataline 99 300 1000000 1541806
dataline 100 313 1000000 1266607
dataline 101 287 1000000 1577068
dataline 102 289 1000000 1557525
dataline 103 265 1000000 1650203
dataline 104 299 1000000 2721996
dataline 105 268 1000000 2171018
dataline 106 261 1000000 2138428
dataline 107 248 1000000 2464619
dataline 108 273 1000000 2069371
dataline 109 233 1000000 2549832
dataline 110 228 1000000 2507602
dataline 111 242 1000000 2683309
dataline 112 225 1000000 3787277
dataline 113 223 1000000 3385594
dataline 114 181 1000000 3457664
dataline 115 212 1000000 3773621
dataline 116 235 1000000 3241294
dataline 117 212 1000000 4039748
dataline 118 196 1000000 4076002
dataline 119 197 1000000 4216571
dataline 120 191 1000000 6046672
dataline 121 205 1000000 5242164
dataline 122 193 1000000 5826027
dataline 123 184 1000000 5985150
dataline 124 193 1000000 5220699
dataline 125 183 1000000 6394398
dataline 126 166 1000000 6804860
dataline 127 182 1000000 7052941
dataline 128 199 1000000 9081431
dataline 129 164 1000000 9493346
dataline 130 161 1000000 9527733
dataline 131 151 1000000 10311288
dataline 132 177 1000000 9186555
dataline 133 169 1000000 11124547
dataline 134 178 1000000 12562868
dataline 135 154 1000000 12741479
dataline 136 167 1000000 15479663
dataline 137 157 1000000 15436601
dataline 138 155 1000000 16690681
dataline 139 148 1000000 17277977
dataline 140 137 1000000 19504895
dataline 141 162 1000000 18781695
dataline 142 138 1000000 22060326
dataline 143 152 1000000 22222123
dataline 144 134 1000000 26356486
dataline 145 150 1000000 24985983
dataline 146 159 1000000 27665757
dataline 147 155 1000000 29276993
dataline 148 124 1000000 31849872
dataline 149 127 1000000 30683069
dataline 150 126 1000000 35491505
*** parfor: user interrupt after 163h, 51min, 56,082 ms cpu time, 21h, 51min, 54,044 ms real time

No comments :