It's Vegabook here (Thomas Browne) and I wanted to provide a few points of perspective on my interest in BQN on the BEAM, some of which I have already provided on Matrix. I offer this not for my own sake, per se, but because I believe that a non-trivial subset of data scientists will have similar perspectives, so my background may be useful context. The below summarises some of my points from my chat with @Gander (Rowan Cannaday) two days ago (Friday 29 Dec 2023).
**Background**
I am neither a systems programmer, nor a programming language designer. That said I have [been around](https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists) in the world of vectorised Python for a long time, and know it and R very well. The latter's first-class vectors are a big draw for me. I work extensively with relationships between time series so vectors and indeed matrices, even tensors, linear algebra, are a big focus. I do not yet know BQN in any serious way but I have read Marshall Lochbaum and Dzaima's conversations, and website materials quite extensively, and have played with Dyalog APL in the past. So I'm not completely and utterly green.
**Interest in BQN**
R in particular opens up the possibility of very concise code operating on large mathematical topologies, which is not only pleasant in the sense of the power it affords, but is usually also semantically much clearer about what one is trying to achieve, mathematically, than the constructs available in imperative programming languages. So I'm sold on array programming.
I have certain issues with mathematical notation, so something that is still concise and powerful, but maps to computing more explicitly, makes complete sense.
Marshall seems to have created something extremely rigorous in its thinking and design, and the CBQN implementation looks seriously capable of taking on the incumbents in terms of performance. That matters: in my view, while performance is not always a limiting factor, in my domain, with often millions, sometimes billions, of data points, performance has to be within one order of magnitude of the competitors (NumPy and R).
**Interest in the BEAM**
I work with streaming data extensively. Inevitably, and across domains, my experience is that the older the data, the exponentially less useful it becomes. "Live" data is where the real opportunity lies, and here Python is severely compromised, while R is essentially absent from even trying. Async in Python is fundamentally cooperative, with all its non-deterministic downsides, and (mostly) single-threaded (the multiprocessing module has its own large set of problems); R does not even attempt to be competitive in async. Both can be thought of primarily as "batch" languages. Languages which do not have a REPL do not qualify for exploratory data science, in my opinion, and the functional languages which do (Haskell, OCaml) map their pre-emptive multitasking onto POSIX threads, leaving a lot of lightweight concurrency capability unaddressed.
Enter the BEAM, which is designed first for systems which are "alive". Without going into large amounts of waffling about what makes it great, let's just say that working with it (which has a learning curve) is a revelation for anybody working with any system that is in "live" operation, and that includes ingesting and transforming real-time data of any kind, and in complex systems with many moving parts.
Naturally, latency "guarantees" exact a significant performance cost, and in most benchmarks the BEAM is around Python-level speed (sans NumPy). This makes it completely unsuited to data science, unless one binds into C libraries.
_A marriage of BQN and the BEAM therefore seems to combine two languages which are incredibly interesting and powerful in their own rights, but occupy fairly orthogonal feature spaces, both of which are IMO potentially interesting to many data scientists._
**Databases**
As an aside, I have worked with people who were experts in KDB/Q, and the tight integration between code and data in that (sadly very expensive) stack meant that for time series work these programmers were able to perform almost magical feats. I have extensive experience with row- and column-oriented databases of all kinds, and there's probably an opportunity to do something with BQN in this space.
**Implementation**
I think Rowan has summarised the tradeoffs pretty well. I'll only say that it would be good to have a fairly faithful mirroring of BEAM semantics, namely the ability to "spawn" BQN instances, even if, naturally, these would likely be much more heavyweight than the fine granularity of BEAM processes. My interest is in having the BEAM ingest large amounts of real-time data, having BQN do periodic "minibatch" algorithmic model re-calibration, and also being able to use BQN for data exploration within the BEAM live environment. I have no idea how BEAM types will map to BQN types, but it goes without saying that the mapping will have to be efficient in order not to incur large [de]serialization costs.
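To make the "spawn BQN instances" idea concrete, here is a minimal sketch of what that might look like on the BEAM side, assuming (hypothetically) that a CBQN executable is wrapped as an external port speaking a length-prefixed binary protocol. The `cbqn` executable name, the `{:eval, expr}` message shape, and the term-to-binary framing are all assumptions for illustration, not an existing interface:

```elixir
defmodule BqnWorker do
  # Hypothetical sketch: one OTP process owning one OS-level BQN instance.
  # Heavyweight compared to BEAM processes, but still spawnable and
  # supervisable on demand, mirroring BEAM semantics as discussed above.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Evaluate a BQN expression in the wrapped instance.
  def eval(pid, expr), do: GenServer.call(pid, {:eval, expr}, :infinity)

  @impl true
  def init(_opts) do
    # 4-byte length-prefixed packets over a port; the "cbqn" binary and
    # its wire protocol are assumed, not real.
    port =
      Port.open(
        {:spawn_executable, System.find_executable("cbqn")},
        [:binary, :exit_status, packet: 4]
      )

    {:ok, %{port: port}}
  end

  @impl true
  def handle_call({:eval, expr}, _from, %{port: port} = state) do
    # Serialization is the open question: term_to_binary round-trips are
    # simple, but may be too slow for large arrays without a shared
    # binary representation.
    Port.command(port, :erlang.term_to_binary({:eval, expr}))

    receive do
      {^port, {:data, bin}} -> {:reply, :erlang.binary_to_term(bin), state}
    end
  end
end
```

The design choice here is the standard BEAM one: keep the foreign runtime behind a port (or NIF) so that a crashing BQN instance takes down only its owning process, which a supervisor can restart.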
**My contribution**
I am working on a modular library (nominally called [BLXX](https://github.com/vegabook/blxx) for now) for bringing live data from APIs into the BEAM. The first API I am implementing is one for the Bloomberg Terminal, to which I have a subscription. The Bloomberg Terminal provides vast breadth and depth of data, and this library will afford the opportunity to test "BeamQN" extensively with real data in large amounts. I plan to use this capability to climb the BQN learning curve, motivated by the fact that I will not just be working with toy data: this will be the real deal, and I plan to use the output to perform genuinely useful work. Please note that finance is one domain I work in, but the library I am working on will be generic enough to work with any form of API-based streaming data source that also allows querying of historic data points. I am designing it with "behaviours" (~"interfaces" in other languages) that will allow others to add streaming data sources in a consistent way.
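For readers unfamiliar with BEAM behaviours, here is a sketch of the kind of contract such a library might define. The callback names and signatures below are purely illustrative, not BLXX's actual API:

```elixir
defmodule StreamingSource do
  # Hypothetical behaviour (interface) that a generic streaming-source
  # library could define; each data source implements these callbacks.

  @doc "Open a connection to the upstream API."
  @callback connect(opts :: keyword()) :: {:ok, state :: term()} | {:error, term()}

  @doc "Subscribe to live ticks for the given instruments."
  @callback subscribe(state :: term(), ids :: [String.t()]) ::
              {:ok, state :: term()} | {:error, term()}

  @doc "Query historic data points, e.g. for backfill or model re-calibration."
  @callback history(state :: term(), id :: String.t(), from :: DateTime.t(), to :: DateTime.t()) ::
              {:ok, [tuple()]} | {:error, term()}
end
```

A concrete source (Bloomberg or otherwise) then declares `@behaviour StreamingSource`, and the compiler warns if any callback is missing, which is what gives the consistency guarantee across data sources.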
As I polish up my rusty C/assembler skills, I may become more directly useful in actual coding for this project in the months ahead.
Meantime I am excited enough by the possibilities of "BeamQN" that I am prepared to devote serious time to testing, documentation, and real-world marketing / evangelism.
Thomas