Further references and notes for my talk on AI democratization at Fossmeet'24
introductory links
- stable diffusion 3 paper
- Inside the nascent industry of AI-designed drugs
- Shaping the future of advanced robotics (deepmind blog article) and SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention
- SORA
- alphageometry and its github repo
- chatbot arena leaderboard
low resource AI, efficient resource usage
- mixtral paper
- ternary params LLMs: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
- deepsparse engine
- resources on pruning
- https://github.com/unslothai/unsloth
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
- PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
- flash attention
- luminal
low-rank approximations
- original lora paper
- lora+: Efficient Low Rank Adaptation of Large Models
- VeRA: Vector-based Random Matrix Adaptation
- QLoRA: Efficient Finetuning of Quantized LLMs
- ...
federated learning, distributed training, and herding resources
- petals: https://github.com/bigscience-workshop/petals, https://petals.dev/
- herding inference
- DiLoCo: Distributed Low-Communication Training of Language Models
- fax - "Scalable and Differentiable Federated Primitives in JAX"
- https://open-assistant.io/
- https://huggingface.co/OpenAssistant
alt archs
alt sequence modeling architectures
- RWKV
- mamba
- hawk and griffin, r/machinelearning thread
spiking neural nets
- https://en.wikipedia.org/wiki/Spiking_neural_network
hyperdimensional computing
- https://www.hd-computing.com/
- https://en.wikipedia.org/wiki/Hyperdimensional_computing
- https://www.tu-chemnitz.de/etit/proaut/en/research/vsa.html
- https://github.com/HyperdimensionalComputing/collection
tsetlin machines
- original paper
- under-construction book on tsetlin machines
- Tsetlin Machine Unified (TMU) - python package aggregating most of the major tsetlin-machine advancements
liquid neural nets
- liquid time-constant networks
- https://www.nature.com/articles/s42256-022-00556-7
- https://github.com/raminmh/CfC
- https://www.science.org/doi/10.1126/scirobotics.adc8892
- https://news.mit.edu/2023/drones-navigate-unseen-environments-liquid-neural-networks-0419
- https://www.csail.mit.edu/news/drones-navigate-unseen-environments-liquid-neural-networks
- Liquid Neural Networks | Ramin Hasani | TEDxMIT
more misc architectures
- spiking neural nets
- analog hardware
- metaheuristics, swarm-learning/intelligence, self-organisation and the like
- predictive coding nets
- large dynamical models and neural ODEs
- hopfield networks
- ...the list is endless
alt hardware
- very chaotic, messy and incomplete blogpost by your speaker: On the pervasive hardware divide
- wikipedia page on AI accelerators
- All about AI Accelerators: GPU, TPU, Dataflow, Near-Memory, Optical, Neuromorphic & more (w/ Author)
- Livestream - P&S Exploring the Processing-in-Memory Paradigm for Future Computing Systems (Spring 2022) and other resources by Onur Mutlu