Skip to content

[RFC]: expand support for additional pseudorandom number generators #136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
7 tasks done
impawstarlight opened this issue Apr 4, 2025 · 3 comments
Open
7 tasks done
Assignees
Labels
2025 2025 GSoC proposal. received feedback A proposal which has received feedback. rfc Project proposal.

Comments

@impawstarlight
Copy link

impawstarlight commented Apr 4, 2025

Full name

Abdul Kaium

University status

Yes

University name

Jahangirnagar University

University program

Computer Science and Engineering

Expected graduation

January 2027

Short biography

I'm a 3rd year undergraduate student of Computer Science & Engineering at Jahangirnagar University, Dhaka, Bangladesh. I've been coding from an early age sparked by a deep curiosity about computers and software, starting with C. Now, I’m proficient in C++ and JavaScript, with experience in several other programming languages. I have a strong passion for competitive problem solving, algorithmic optimizations, GPU computation, and color science.

Timezone

Bangladesh Standard Time (UTC+06:00)

Contact details

email: [email protected], github: impawstarlight

Platform

Linux

Editor

I prefer VSCode for its rich set of features, built-in Git integration, and extensive support for extensions. The editor’s versatility and customizability make it an excellent choice for various projects, whether it's competitive programming, web development, or system-level work. Its fast performance and portability across different platforms also add to its appeal, making it my go-to tool for coding.

Programming experience

My programming journey began in 2017 with C, driven by personal curiosity. I initially used it to solve math homework, find large prime numbers, calculate large factorials, and explore other number-related hobbies. Later that year, I learned C++ and Java to understand object-oriented programming, but I didn't use them much at that time. In 2021, I started learning Javascript (along with HTML and CSS), which opened up a whole new world of possibilities, allowing me to build GUI applications for the first time.

Over the next two years, I focused heavily on vanilla JavaScript, building various mini-projects. My most significant project is Chromania, a color picker and namer tool that supports advanced color spaces like CIELAB and Oklab. Another project worth mentioning is BRUHsimp, an very basic algebraic expression processor for expanding and simplifying moderately complex expressions, which I created to aid my work on Chromania.

For much of this journey, I was coding on a mobile phone due to lack of access to a computer. This limitation sparked my interest in performance optimization techniques, as I had to compensate for the limited processing power of my device.

After enrolling in university in 2023, I started competitive programming (with C and later transitioned to C++), which I found incredibly satisfying as it aligned with my love for math, problem-solving and optimization. I’m currently an Expert on Codeforces and an active participant in nation-wide inter-university programming contests.

JavaScript experience

JavaScript is my go-to language for personal projects, and it's the language I'm most comfortable with. I’ve worked primarily with vanilla JavaScript, handling everything from DOM manipulation to creating complex algorithms manually.

The best part of JavaScript for me is its portability across different devices along with its ability to easily create GUI applications which made it accessible to me even without a computer, and helped me dive deep into programming.

On the downside, I find JavaScript's lack of static typing a bit limiting, especially for larger projects, and its inability to handle 64-bit integer arithmetic natively is something I wish it had.

Node.js experience

I’ve briefly explored Node.js while experimenting with frontend frameworks like React and Svelte, and backend technologies like Express. One fun project I built with Node.js is a Rock-Paper-Scissors game with WebSockets, where two users can play and chat online. I hosted it locally using an ngrok free-tier domain, which gave me hands-on experience with real-time communication between clients and servers.

C/Fortran experience

C was my first programming language, and I’ve spent a significant amount of time with it, particularly for competitive programming. I’m very comfortable with its core concepts, including pointers, dynamic memory management, and meta-programming.

As for Fortran, although I had no prior experience, I’ve looked into some Fortran code and found it relatively accessible, especially when using it as a reference for C or JavaScript implementations. I plan to learn more about Fortran during the community bonding period to better understand its application in scientific computing.

Interest in stdlib

stdlib stands out as a powerful JavaScript alternative to Python’s Numpy, offering efficient utilities for data manipulation, statistical analysis, mathematical computations, and more, all with a clean and modular design. I’m fascinated by its motivation and design principles, and I’m excited about the opportunity to be a part of this wonderful project that has the potential to become "The Standard Library" for high-performance, advanced mathematical computation in JavaScript.

My favorite feature of stdlib is its REPL environment which helped me a lot with debugging while working on my PRNG implementations as it enables us to quickly run stdlib functions from the command line. Another thing I particularly appreciate about stdlib is its large collection of assertion and utility functions that help keep the codebase clean and meaningful.

Version control

Yes

Contributions to stdlib

I have successfully contributed to stdlib with simple patches as well as complex features that are directly related to this project.

Merged PRs:

Open PRs:

All PRs:

stdlib showcase

Simulations with random numbers

Goals

The goal of this project is to add Javascript and C implementations of at least 12 and upto 20 new pseudorandom number generators to the @stdlib/random namespace. The large uncertainty gap in the expected number of new PRNGs is due to the following facts:

  • A lot of historically relevant PRNGs are relatively easy to to implement in general but complexities may arise due to licensing issues or incomplete documentations.
  • Many well-performing modern PRNGs require 64-bit arithmetic that might be slow in Javascript due to the overhead of double-word emulation and need to be carefully balanced against performance vs quality.
  • Variants or structurally similar PRNGs might become easy to implement after the completion of a similar PRNG that is included in the primary goals.
    Hopefully, these confusions will be resolved as the project progresses, resulting in the implementation of a significantly greater number of PRNGs than the minimum specified above. Hence, I'm fairly optimistic that the actual number will be closer to 20 at the end of this project.

Primary Goals

The primary goal consists of implementing 12 new PRNGs in @stdlib/random/base along with additional higher-level wrappers (array, iter, etc.) for each of them. The proposed PRNGs are chosen with a careful evaluation of their statistical propety, performance, historical relevance, and implementation difficulty in Javascript. These are the ones that I've explored in detail and judged to be relevant and readily implementable in stdlib. All the proposed PRNGs below will have 32-bit outputs unless otherwise specified. Additionally, a key focus throughout the project will be on finding an overall best general purpose PRNG among the implemented ones that can serve as the new default PRNG in stdlib.

  1. PCG-XSH-RR 64/32:
    Permuted Congruential Generator, uses a simple Linear Congruential Generator under the hood but applies an ouput permutation to improve randomness quality. The XSH-RR variant is the most widely used one for producing 32-bit outputs.

    • State size: 64-bit
    • Period: 264
    • ✔️ Strong statistical properties despite a simple design.
    • ✔️ One of the rare PRNGs that passes BigCrush with a small state size of only 64-bit.
    • ❌ Requires 64-bit integer arithmetic, which needs to be emulated in JavaScript and may result in slower performance than expected.
  2. PCG-RXS-M-XS 32/32:
    A reduced state variant of the PCG family for use when the state size must be limited to 32-bit. It's a faster alternative to the XSH-RR variant with a small compromise on the statistical quality.

    • State size: 32-bit
    • Period: 232
    • ✔️ Offers the best statistical quality among all PRNGs with a 32-bit state.
    • ✔️ Fast performance due to simple implementation requiring only a few 32-bit operations.
    • ❌ Doesn't pass BigCrush due to the very small state size but passes with a 36-bit state which indicates its superiority over other PRNGs with a 32-bit state.
  3. Philox-4x32-10:
    A counter-based PRNG from the Random123 suite, designed for parallel computation. The 4x32-10 variant is the standard, using four 32-bit words and 10 rounds of Feistel-like mixing.

    • State size: 128-bit (4x32-bit counters)
    • Period: 2128
    • ✔️ Excellent statistical quality - passes BigCrush and PractRand with ease.
    • ✔️ Fully parallelizable due to its stateless, counter-based design.
    • ❌ Might be slightly slower than other PRNGs of similar state size due to use of multiple rounds of multiplication and XORs.
  4. Squares (32-bit):
    A counter-based PRNG that applies multiple rounds of squaring to a counter, based on the concept of von Neumann's middle-square method.

    • State size: 32-bit
    • Period: 232
    • ✔️ Simple, stateless design that can be easily parallelized.
    • ✔️ High-quality randomness that passes many statistical tests, including BigCrush and PractRand.
    • ❌ Requires 64-bit integer arithmetic, which needs to be emulated in JavaScript and may result in slower performance than expected.
  5. Xorshift (32-bit):
    Xorshift is a family of simple and efficient PRNGs that uses only bitwise XORs and shifts to generate pseudo-random numbers. The 32-bit variant has a small state size and is widely used for applications requiring fast and reasonably good random numbers.

    • State size: 32-bit
    • Period: 232 - 1
    • ✔️ Extremely fast due to its simple design involving only a few bitwise operations.
    • ✔️ Low memory usage with just a 32-bit state.
    • ❌ Fails BigCrush tests due to relatively weak randomness quality.
    • ❌ Not suitable for high-quality random number requirements.
  6. Xorwow:
    Xorwow is an extension of the Xorshift family that enhances randomness quality by incorporating an additional mixing step.

    • State size: 192-bit (5x32-bit words + 1x32-bit counter)
    • Period: 2192 - 232
    • ✔️ Better statistical properties than basic Xorshift PRNGs.
    • ✔️ Large period while being very fast and efficient.
    • ✔️ Default PRNG in NVIDIA's cuRAND library for GPU applications.
    • ❌ Fails a few tests in BigCrush but still performs well overall.
  7. WELL512a:
    The smallest variant of the WELL (Well Equidistributed Long-period Linear) family of PRNGs, designed to improve upon the weaknesses of Mersenne Twister.

    • State size: 512-bit (16×32-bit words)
    • Period: 2512 − 1
    • ✔️ Maximally equidistributed, even in high dimensions.
    • ✔️ Passes BigCrush and PractRand.
    • ✔️ Relatively fast.
    • ❌ Larger state size.
  8. WELL1024a:
    A higher-capacity variant of the WELL family with improved period and statistical strength, suitable for applications needing large state spaces.

    • State size: 1024-bit (32×32-bit words)
    • Period: 21024 − 1
    • ✔️ Maximally equidistributed, suitable for high-dimensional simulations.
    • ✔️ Passes BigCrush and PractRand.
    • ❌ Large state size may be overkill for some applications.
  9. MRG32k3a:
    A Combined Multiple Recursive Generator by Pierre L’Ecuyer, designed for high-quality, long-period pseudo-random number generation.

    • State size: 192-bit (6×32-bit words)
    • Period: ≈ 2191
    • ✔️ Passes BigCrush
    • ✔️ Widely used in scientific and simulation contexts (R's rstream).
    • ❌ Slightly complex to implement.
  10. Taus88:
    A maximally equidistributed combined Tausworthe generator that combines 3 LFSRs to yield a reasonably good PRNG with moderate period.

    • State size: 96-bit (3×32-bit words)
    • Period: ≈ 288
    • ✔️ Very fast due to its reliance on only a few bitwise operations.
    • ✔️ Useful in reproducible legacy simulations
    • ❌ Fails BigCrush and is statistically weaker compared to modern alternatives.
    • ❌ Generally considered legacy, not preferred for high-dimensional simulations.
  11. LFSR113:
    An improved variant of the maximally equidistributed combined Tausworthe generators developed by L'Ecuyer, which combines 4 LFSRs.

    • State size: 128 bits (4×32-bit words)
    • Period: ≈ 2113
    • ✔️ Very fast due to simple shift and XOR operations.
    • ✔️ Improved equidistribution and statistical quality over Taus88.
    • ❌ Still fails some tests BigCrush.
    • ❌ Considered legacy by today's standards, with modern alternatives offering higher quality with similar performance.
  12. Wichmann–Hill:
    A combined linear congruential generator combining the outputs of three independent LCGs with prime moduli, yielding double precision floating point values in the interval [0,1).

    • State size: 96-bit (3x32-bit words)
    • Period: Approximately 6.95×1012
    • ✔️ Simple, fast and easy to implement.
    • ✔️ Historically popular and used in early software (older versions of Excel and MATLAB).
    • ❌ Statistical quality is relatively poor by modern standards, doesn't pass BigCrush.
    • ❌ Considered legacy and generally not recommended for moder usage.

Stretch goals:

In case the aforementioned goals are successfully completed way before the project deadline and there's ample time to spare, the following objectives might be persued to further improve the @stdlib/random libray:

  • A default PRNG for 64-bit output: Research for an efficient and high quality 64-bit output PRNG that avoids the use of 64-bit arithmetic, especially 64-bit multiplication since it causes a big performance penalty in Javascript due to the lack of native 64-bit integers which is results in manual double-word emulation. This will be useful for efficiently generating both integers and floating point values with 53-bit precision, something that the V8 engine currently lacks.
  • Implement WELL19937: This might even be converted to a principle objective given the need for a MT19937 alternative with equivalent period and equidistribution properties. As WELL19937 has significantly better equidistribution than MT19937 and a faster recovery from zeroland states (when the state bits contains mostly zeros), it has a strong potential to replace MT19937 in large-perioud high-quality random number needs. But a small downside is that it is slightly slower than MT19937.
  • Implement additional legacy PRNGs: Consider PRNGs that might still be relevant for reproducibilty, academic interest, or historical significance.

Why this project?

Random number generation has always intrigued me, especially the idea that deterministic algorithms can produce seemingly random outputs feels like a paradox. This project offers a perfect opportunity to dive deep into the intricacies of PRNGs and explore how they are designed and optimized. The algorithmic challenges involved in implementing various PRNGs excite me, as they remind me of tackling problems in competitive programming where efficiency and precision are of utmost importance. I’m eager to contribute to this project and deepen my understanding of a critical area of computer science that has broad applications in fields like cryptography, simulations, and gaming.

Qualifications

As a competitive programmer, I have solid experience with algorithmic analysis, bit manipulation and modular arithmetic which are core concepts in PRNG algorithms and that makes this project a natural fit for me. Additionally, I’ve completed academic courses like Algorithms and Data Structures, Discrete Mathematics, and Digital Logic Design, giving me a strong foundation in computational principles.

I’ve also demonstrated my familiarity with the stdlib codebase and PRNG internals by contributing two new PRNG implementations, which are awaiting approval and I’m currently studying research papers by various PRNG designers to understand the core ideas behind their algorithms.

Prior art

The core idea behind this project is well-established in other ecosystems. Python’s random module, NumPy’s Generator and BitGenerator classes, and C++’s header all offer a variety of PRNGs with different tradeoffs in speed, space, and statistical quality. Libraries like rand in Rust or Random123 (from D.E. Shaw Research) take it even further, with a focus on modern, parallel-friendly designs.

Most of the PRNGs proposed in this project have accompanying papers which are listed below:

Additional papers that are useful for this project are listed below:

Useful books:

Commitment

I plan to invest ~30 hr/week on average during the GSoC period for a total of 350 hours. I have no other commitments during this period, which allows me to fully dedicate my time to this project.

Schedule

Assuming a 12 week schedule, here is a draft schedule for realizing the project goals.

  • Community Bonding Period: Preparation

    • Analyze the proposed PRNGs and discuss implementation details, additional API design, and possible challenges with mentors.
    • Gather detailed resources and reference implementations for each PRNG.
    • Get familiar with Fortran and analyze relevant PRNG implementations.
    • Review stdlib development protocol and ensure proper setup of the development environment.
    • Potentially add a package for emulating 64-bit arithmetic with double-word decomposition.
  • Week 1: Xorshift (32-bit) and Xorwow

    • Review and finalize the already implemented Xorshift (32-bit) in random/base/xorshift32 package that is currently awating review.
    • Implement Xorwow with complete documentation and unit tests.
    • Optionally explore additional Xorshift variants or derivatives that could be added with a focus on popularity and reproducibilty needs.
      Potential candidates include Xorshift+, Xoshiro and Xoroshiro family of PRNGs.
  • Week 2: PCG-XSH-RR 64/32 and PCG-RXS-M-XS 32/32

    • Review and finalize the already implemented PCG-XSH-RR 64/32 in random/base/pcg32 package that is currently awating review.
    • Implement PCG-RXS-M-XS 32/32 with complete documentation and unit tests.
    • Optionally explore additional members of the PCG family that could be added based on widespread usage.
      One potential candidate is PCG-RXS-M-XS 64/64 which has the potential to be the overall best PRNG for 53-bit output in stdlib despite requiring 64-bit arithmetic, but further research is needed on this.
  • Week 3: Philox-4x32-10

    • Implement Philox-4x32-10 with complete documentation and unit tests.
    • Optionally explore additional members of the Random123 family such Philox-2x32-10, Threefish, ARS and potentially implement one.
  • Week 4: Squares (32-bit)

    • Implement Squares (32-bit) with complete documentation and unit tests.
    • Perform additional testing for emulated 64-bit arithmetic used in Squares (32-bit)
    • Optionally implement Squares (64-bit) for 53-bit output.
    • Optionally implement Middle-Square Weyl Sequence which is a predecessor Squares by the same author.
  • Week 5: WELL512a and WELL1024a

    • Implement WELL512a with complete documentation and unit tests.
    • Implement WELL1024a with complete documentation and unit tests.
    • Optionally implement WELL19937a which is a strong competitor to MT19937.
  • Week 6: Midterm Review

    • Review all implementaions added up until now and ensure consistency and correctness.
    • Collect detailed feedback and discuss challenges faced and possible changes to project plan if necessary.
    • Document the development process and future plans.
    • Compare and benchmark the implemented PRNGs in search of the best one that can potentially replace MT19937.
  • Week 7: MRG32k3a

    • Implement MRG32k3a with complete documentation and unit tests.
    • Optionally implement MRG32k5a, though this is very low priority as I found it to be very rarely used. Feedback needed on this.
    • Optionally explore additional legacy PRNGs like IBM Randu, UNIX Rand48
  • Week 8: Taus88, LFSR113 and Wichmann-Hill

    • Implement Taus88 with complete documentation and unit tests.
    • Implement LFSR113with complete documentation and unit tests.
    • Implement Wichmann-Hill with complete documentation and unit tests.
    • Optionally explore additional legacy PRNGs such as KISS, R250 and Knuthran family, Borosh, Waterman and Fishman family and find complete documentations if possible.
  • Week 9: Higher-Level Wrappers

    • Add structured package data and setup scaffolding for higher-level wrappers.
    • Generate higher-level wrappers in random/array, ranodm/iter, ranodm/streams and ranodm/strided.
    • Test and verify generated wrappers and fix potential issues.
  • Week 10: A New Default

    • Compare and benchmark all the PRNGs in search of the best one that can potentially replace the current default MT19937.
    • Check compatibility with the new default PRNG and update existing packages.
  • Week 11: Code Freeze

    • Finalize all features and focus on optimization, refactoring and cleanup.
    • Update documentation for the base package.
    • Ensure 100% test coverage.
  • Week 12: Final Review and Wrap-up

    • Revisit all previous implementations and ensure consistency and correctness.
    • Test reproducibilty and performance with respect to reference implementations.
    • Write final development report describing the process, chanllenges, findings and plan for future works that might be pursued after the GSoC program.
  • Final Week:

    • Adress potenitial issues and feedback.
    • Submit final works.

Notes:

  • Each week will deliver at least 1 pull request with a new PRNG implementation.
  • Optional goals are mentioned in each week based on the relevant topic of that week but they could be pursued in other orders.
  • It may look a bit overrun with optional goals but that just reflects the fundamental uncertainty that can't be resolved without diving deep into the details of each PRNG.
  • Hopefully, about half of the optional goals will be realized along with the primary goals upon the completion of this project.

Related issues

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
@impawstarlight
Copy link
Author

@kgryte I have completed the draft proposal. Please take a look.

@kgryte
Copy link
Member

kgryte commented Apr 8, 2025

@impawstarlight Thank you for this RFC. In general, everything looks good to me. Thank you for the various links and reference material. Definitely clear that you did a fair amount of research on this topic. Your timeline seems reasonable to me, and the primary PRNGs all look reasonable.

Perhaps one comment. Should we find a PRNG which we believe should supplant MT19937 as the default PRNG in stdlib, we'll likely need to leave some time for updating various packages which assume MT19937 (e.g., seed options, state format, etc).

@kgryte kgryte added the received feedback A proposal which has received feedback. label Apr 8, 2025
@impawstarlight
Copy link
Author

@kgryte Thank you very much for the kind feedback.

Perhaps one comment. Should we find a PRNG which we believe should supplant MT19937 as the default PRNG in stdlib, we'll likely need to leave some time for updating various packages which assume MT19937 (e.g., seed options, state format, etc).

Well in that case, I will put away some time for this within week 9-11 and possibly merge some simpler PRNGs into other weeks to complete the PRNG implementations as early as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2025 2025 GSoC proposal. received feedback A proposal which has received feedback. rfc Project proposal.
Projects
None yet
Development

No branches or pull requests

2 participants