https://www.beren.io/2023-04-11-Scaffolded-LLMs-natural-language-computers/
https://queue.acm.org/detail.cfm?id=3623391
The team at NVIDIA brings confidentiality and integrity to user code and data for accelerated computing.
Mistral-7B-v0.1 is a small, yet powerful model adaptable to many use-cases. Mistral 7B is better than Llama 2 13B on all benchmarks, has natural coding abilities, and 8k sequence length. It’s released under Apache 2.0 licence. We made it easy to deploy on any cloud, and of course on your gaming GPU.
https://about.fb.com/news/2023/09/introducing-ai-powered-assistants-characters-and-creative-tools/
- We’re starting to roll out AI stickers across our apps, and soon you’ll be able to edit your images or even co-create them with friends on Instagram using our new AI editing tools, restyle and backdrop.
- We’re introducing Meta AI in beta, an advanced conversational assistant that’s available on WhatsApp, Messenger, and Instagram, and is coming to Ray-Ban Meta smart glasses and Quest 3. Meta AI can give you real-time information and generate photorealistic images from your text prompts in seconds to share with friends. (Available in the US only)
- We’re also launching 28 more AIs in beta, with unique interests and personalities. Some are played by cultural icons and influencers, including Snoop Dogg, Tom Brady, Kendall Jenner, and Naomi Osaka.
- Over time, we’re making AIs for businesses and creators available, and releasing our AI studio for people and developers to build their own AIs.
- These new AI experiences also come with a new set of challenges for our industry. We’re rolling out our new AIs slowly and have built in safeguards.
https://www.raspberrypi.com/products/raspberry-pi-5/
The everything computer. Optimised.
With 2–3× the speed of the previous generation, and featuring silicon designed in‑house for the best possible performance, we’ve redefined the Raspberry Pi experience.
Coming October 2023
https://shop.boox.com/products/palma
I really like my Boox e-reader but having a more pocketable device would be amazing. It's unfortunate you can't also use it for handwritten notes but at this size it makes sense.
https://blog.research.google/2023/09/distilling-step-by-step-outperforming.html
In “Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes”, presented at ACL2023, we set out to tackle this trade-off between model size and training data collection cost. We introduce distilling step-by-step, a new simple mechanism that allows us to train smaller task-specific models with much less training data than required by standard fine-tuning or distillation approaches that outperform few-shot prompted LLMs’ performance. We demonstrate that the distilling step-by-step mechanism enables a 770M parameter T5 model to outperform the few-shot prompted 540B PaLM model using only 80% of examples in a benchmark dataset, which demonstrates a more than 700x model size reduction with much less training data required by standard approaches.
https://www.bbc.com/future/article/20230912-how-i-hacked-my-brain
There is growing evidence that simple, everyday changes to our lives can alter our brains and change how they work. Melissa Hogenboom put herself into a scanner to find out.
I was surprised that something as simple as mindfulness can play such a crucial role in keeping our minds healthy. Research has shown that mindfulness is a simple but powerful way to enhance several cognitive functions. It can improve attention, relieve pain and reduce stress. Research has found that after only a few months of mindfulness training, certain depression and anxiety symptoms can ease – though as with any complex mental health problem, this may of course vary depending on individual circumstances.
http://josephthacker.com/ai/2023/09/18/vim-llm-hacks.html
... I learned that you could be inside vim, but manipulate the entire file as if you were piping the contents of the file into a command. The output of the command does in-line replacement of the entire file with those changes. That sounds confusing, but it just means you can be inside a vim file and do :%!grep test and it’ll remove all lines that don’t contain test, for example.
This post is a simple showcase of taking that concept, but throwing an llm into the mix to add more dynamic functionality.
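A rough sketch of the kind of program `:%!` can call: anything that reads the buffer on stdin and writes the replacement to stdout. The script name `keep_matching.py` and the function below are hypothetical, and it only mimics the `grep test` example; an LLM-backed filter would have the same stdin-to-stdout shape with a model call in the middle.

```python
import sys


def keep_matching(text: str, needle: str) -> str:
    """Return only the lines of `text` that contain `needle` (grep-like)."""
    kept = [line for line in text.splitlines() if needle in line]
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__" and not sys.stdin.isatty():
    # Vim's :%! pipes the whole buffer in on stdin and replaces the
    # buffer with whatever the command prints on stdout.
    needle = sys.argv[1] if len(sys.argv) > 1 else "test"
    sys.stdout.write(keep_matching(sys.stdin.read(), needle))
```

Saved as `keep_matching.py` (an assumed name), it could be run from inside vim as `:%!python3 keep_matching.py test`.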
https://www.youtube.com/watch?v=bskEGP0r3hE
The future is AVX10, so says Intel. Recently a document was released showcasing a post-AVX512 world, and to explain why this matters, I've again invited the Chips And Cheese crew onto the channel. Chester and George answer my questions on AVX10 and why it matters! Visit http://www.chipsandcheese.com to learn more!
https://www.nps.gov/katm/learn/fat-bear-week.htm
Fat Bear Week - an annual celebration of success. All bears are winners but only one true champion will emerge. Held over the course of seven days and concluding on the Fat Bear Tuesday, people chose which bear to crown in this tournament style bracket where bears are pitted against each other for your vote.
https://openai.com/blog/chatgpt-can-now-see-hear-and-speak
We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.
https://www.aboutamazon.com/news/company-news/amazon-aws-anthropic-ai
Anthropic selects AWS as its primary cloud provider and will train and deploy its future foundation models on AWS Trainium and Inferentia chips, taking advantage of AWS’s high-performance, low-cost machine learning accelerators.
Happy 20th anniversary! Also, thanks for the generous linking. Lots of new folks to read and subscribe to.
DALL·E 3 understands significantly more nuance and detail than our previous systems, allowing you to easily translate your ideas into exceptionally accurate images.
https://matrix.org/blog/2023/09/matrix-2-0/
TL;DR: If you want to play with a shiny new Matrix 2.0 client, head over to Element X.
Sponsored post, but it's still a good list with guidance and suggestions for common questions.
https://huggingface.co/blog/optimize-llm
In this blog post, we will go over the most effective techniques at the time of writing this blog post to tackle these challenges for efficient LLM deployment:
1. Lower Precision: Research has shown that operating at reduced numerical precision, namely 8-bit and 4-bit, can achieve computational advantages without a considerable decline in model performance.
2. Flash Attention: Flash Attention is a variation of the attention algorithm that not only provides a more memory-efficient approach but also realizes increased efficiency due to optimized GPU memory utilization.
3. Architectural Innovations: Considering that LLMs are always deployed in the same way during inference, namely autoregressive text generation with a long input context, specialized model architectures have been proposed that allow for more efficient inference. The most important advancement in model architectures hereby are Alibi, Rotary embeddings, Multi-Query Attention (MQA) and Grouped-Query-Attention (GQA).
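To make the lower-precision point concrete, here is a back-of-the-envelope sketch of how much memory the weights alone need at each precision. The 7-billion-parameter size is just an illustrative assumption, and the numbers ignore activations, the KV cache, and any quantization overhead such as scales.

```python
# Bytes needed to store one weight at each precision (weights only).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in GB, taking 1 GB = 1e9 bytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Hypothetical 7B-parameter model, chosen only for illustration.
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gb(7e9, dtype):5.1f} GB")
```

Going from bfloat16 to 4-bit cuts the weight footprint by 4x, which is often the difference between fitting on one consumer GPU or not.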
https://blog.minch.co/2022/11/15/software-squared.html
A new generation of AIs that become increasingly general by producing their own training data
We are currently at the cusp of transitioning from “learning from data” to “learning what data to learn from” as the central focus of AI research.
If deep learning can be described as “Software 2.0”—software that programs itself based on example inputs/output pairs, then this promising, data-centric paradigm, in which software effectively improves itself by searching for its own training data, can be described as a kind of “Software²”. This paradigm inherits the benefits of Software 2.0 while improving on its core, data-bound weaknesses: While deep learning (Software 2.0) requires the programmer to manually provide training data for each new task, Software² recasts data as software that models or searches the world to produce its own, potentially unlimited, training tasks and data.
https://github.com/OpenRobotLab/PointLLM
We introduce PointLLM, a multi-modal large language model capable of understanding colored point clouds of objects. It perceives object types, geometric structures, and appearance without concerns for ambiguous depth, occlusion, or viewpoint dependency. We collect a novel dataset comprising 660K simple and 70K complex point-text instruction pairs to enable a two-stage training strategy. To rigorously evaluate our model's perceptual abilities and its generalization capabilities, we establish two benchmarks: Generative 3D Object Classification and 3D Object Captioning, assessed through three different evaluation methods.
https://a16z.com/how-are-consumers-using-generative-ai/
1. Most leading products are built from the “ground up” around generative AI
Like ChatGPT, the majority of products on this list didn’t exist a year ago—80% of these websites are new. Of the 50 companies on the list, only 5 are products of, or acquisitions by, pre-existing big tech companies... Of the remaining list members, a whopping 48% are completely bootstrapped, with no outside funding, according to PitchBook data.
2. ChatGPT has a massive lead, for now…
ChatGPT represents 60% of monthly traffic to the entire top 50 list, with an estimated 1.6 billion monthly visits and 200 million monthly users (as of June 2023). This makes ChatGPT the 24th most visited website globally.
3. LLM assistants (like ChatGPT) are dominant, but companionship and creative tools are on the rise
General LLM chatbots represent 68% of total consumer traffic to the top 50 list. However, two other categories have started to drive significant usage in recent months—AI companions (such as CharacterAI) and content generation tools (such as Midjourney and ElevenLabs). Within the broader content generation category, image generation is the top use case with 41% of traffic, followed by prosumer writing tools at 26%, and video generation at 8%. Another category worth mentioning? Model hubs. There are only 2 on the list, but they drive significant traffic—Civitai (for images) and Hugging Face both rank in the top 10. This is especially impressive because consumers are typically visiting these sites to download models to run locally, so web traffic is likely an underestimate of actual usage.
4. Early “winners” have emerged, but most product categories are up for grabs
Good news for builders: despite the surge in interest in generative AI, in many categories there is not yet a runaway success.
5. Acquisition for top products is entirely organic—and consumers are willing to pay!
The majority of companies on this list have no paid marketing (at least, that SimilarWeb is able to attribute). There is significant free traffic “available” via X, Reddit, Discord, and email, as well as word of mouth and referral growth. And consumers are willing to pay for GenAI. 90% of companies on the list are already monetizing, nearly all of them via a subscription model. The average product on the list makes $21/month (for users on monthly plans)—yielding $252 annually.
6. Mobile apps are still emerging as a GenAI platform
Consumer AI products have, thus far, been largely browser-first, rather than app-first. Even ChatGPT took 6 months to launch a mobile app! Why aren’t more AI companies building on mobile? The browser is a natural starting place to reach the broadest base of consumers. Many AI companies have small teams and likely don’t want to fragment their focus and resources across Web, iOS, and Android. Given that the average consumer now spends 36 minutes more per day on mobile than desktop (4.1 hours vs. 3.5 hours), we expect to see more mobile-first GenAI products emerge as the technology matures.
https://wordpress.org/plugins/activitypub/
I can't believe I missed this announcement. This is great to see!
Enter the fediverse with ActivityPub, broadcasting your blog to a wider audience! Attract followers, deliver updates, and receive comments from a diverse user base of ActivityPub-compliant platforms.
With the ActivityPub plugin installed, your WordPress blog itself functions as a federated profile, along with profiles for each author. For instance, if your website is example.com, then the blog-wide profile can be found at @example.com@example.com, and authors like Jane and Bob would have their individual profiles at @jane@example.com and @bobz@example.com, respectively.
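A small sketch of how a handle like the ones above splits into a user and a domain, and how it maps to a WebFinger lookup URL (the `/.well-known/webfinger` path and `acct:` scheme come from RFC 7033). That other servers resolve these WordPress profiles this way is my assumption based on how fediverse handles are usually looked up; the function names are made up for illustration.

```python
from urllib.parse import urlencode


def parse_handle(handle: str) -> tuple[str, str]:
    """Split '@user@domain' into (user, domain)."""
    user, _, domain = handle.lstrip("@").partition("@")
    if not user or not domain:
        raise ValueError(f"not a user@domain handle: {handle!r}")
    return user, domain


def webfinger_url(handle: str) -> str:
    """Build the WebFinger lookup URL for a fediverse handle."""
    user, domain = parse_handle(handle)
    query = urlencode({"resource": f"acct:{user}@{domain}"})
    return f"https://{domain}/.well-known/webfinger?{query}"


if __name__ == "__main__":
    for handle in ("@example.com@example.com", "@jane@example.com"):
        print(handle, "->", webfinger_url(handle))
```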
https://huggingface.co/spaces/coqui/xtts
XTTS is a Voice generation model that lets you clone voices into different languages by using just a quick 3-second audio clip.
XTTS is built on previous research, like Tortoise, with additional architectural innovations and training to make cross-language voice cloning and multilingual speech generation possible.
https://huggingface.co/blog/wuerstchen
Würstchen is a diffusion model, whose text-conditional component works in a highly compressed latent space of images. Why is this important? Compressing data can reduce computational costs for both training and inference by orders of magnitude. Training on 1024×1024 images is way more expensive than training on 32×32. Usually, other works make use of a relatively small compression, in the range of 4x - 8x spatial compression. Würstchen takes this to an extreme. Through its novel design, it achieves a 42x spatial compression! This had never been seen before, because common methods fail to faithfully reconstruct detailed images after 16x spatial compression. Würstchen employs a two-stage compression, what we call Stage A and Stage B. Stage A is a VQGAN, and Stage B is a Diffusion Autoencoder (more details can be found in the paper). Together Stage A and B are called the Decoder, because they decode the compressed images back into pixel space. A third model, Stage C, is learned in that highly compressed latent space. This training requires fractions of the compute used for current top-performing models, while also allowing cheaper and faster inference. We refer to Stage C as the Prior.
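To get a feel for what those compression factors mean, here is the plain arithmetic for a 1024×1024 image. This is only division and rounding; the exact latent shapes the model uses may differ.

```python
def latent_side(image_side: int, spatial_compression: float) -> int:
    """Side length of the latent grid after spatial compression (rounded)."""
    return round(image_side / spatial_compression)


if __name__ == "__main__":
    for factor in (4, 8, 42):
        side = latent_side(1024, factor)
        print(f"{factor:>2}x: {side}x{side} latent, {side * side} positions")
```

At 42x the grid is roughly 24×24, so Stage C works on a few hundred spatial positions instead of the tens of thousands left by 4x or 8x compression.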
https://huggingface.co/blog/t2i-sdxl-adapters
T2I-Adapter is an efficient plug-and-play model that provides extra guidance to pre-trained text-to-image models while freezing the original large text-to-image models. T2I-Adapter aligns internal knowledge in T2I models with external control signals. We can train various adapters according to different conditions and achieve rich control and editing effects.
Over the past few weeks, the Diffusers team and the T2I-Adapter authors have been collaborating to bring the support of T2I-Adapters for Stable Diffusion XL (SDXL) in diffusers. In this blog post, we share our findings from training T2I-Adapters on SDXL from scratch, some appealing results, and, of course, the T2I-Adapter checkpoints on various conditionings (sketch, canny, lineart, depth, and openpose)!
https://huggingface.co/blog/falcon-180b
Today, we're excited to welcome TII's Falcon 180B to HuggingFace! Falcon 180B sets a new state-of-the-art for open models. It is the largest openly available language model, with 180 billion parameters, and was trained on a massive 3.5 trillion tokens using TII's RefinedWeb dataset. This represents the longest single-epoch pretraining for an open model.
https://retool.com/visual-basic/
How Visual Basic became the world's most dominant programming environment, its sudden fall from grace, and why its influence is still shaping the future of software development.
https://www.modular.com/blog/mojo-its-finally-here
Today, we’re excited to announce the next big step in Mojo’s evolution: Mojo is now available for local download – beginning with Linux systems, and adding Mac and Windows in coming releases.
https://github.com/Textualize/textual-web
Textual Web publishes Textual apps and terminals on the web.
https://www.dreadcentral.com/the-overlook-motel/
I'm a fan of finding hidden gems and overlooked films, especially when it comes to horror. It's how I came across some of my favorites like Hell House and Terrifier. That's why I was excited to run into The Overlook Motel series from Dread Central which spotlights these kinds of films.
https://mullvad.net/en/blog/2023/9/7/tailscale-has-partnered-with-mullvad/
Today we announce a partnership with Tailscale that allows you to use both in conjunction through the Tailscale app. This functionality is not available through the Mullvad VPN app.
This partnership allows customers of Tailscale to make use of our WireGuard VPN servers as “exit nodes”. This means that whilst connected to Tailscale, you can access your devices across Tailscale’s mesh network, whilst still connecting outbound through Mullvad VPN WireGuard servers in any location.
https://www.jwz.org/blog/2023/09/platos-cave-regrets-to-inform-you-it-will-be-raising-its-rent/
If you are receiving this letter, it means you have been designated a tenant of the cave—i.e., you are chained to the wall, you are forced to watch shadows for all eternity, you are projecting said shadow puppets, and/or you are a philosopher who was able to break free and understand the true shackles of reality (PhD candidates about to argue their thesis).
We do not undertake this lightly. As the costs of maintaining a cave meant to trap you in your ignorance increases year after year, we want you to know, from the bottom of our hearts, that we, too, are suffering. We get that times are tough, and we hope you can extend that sympathy to us, the managers of your cave.
Please rest assured that cave costs are increasing everywhere. We manage many other caves like that of Polyphemus the Cyclops and the childhood home of Zeus. So, trust us: we know caves.
We hope you will continue to enjoy living in our cave. We believe you are a valued part of the Plato's Cave community. Credit, cash, or Venmo all work.
Original Source: https://www.mcsweeneys.net/articles/platos-cave-regrets-to-inform-you-it-will-be-raising-its-rent
https://www.aaron-powell.com/posts/2023-09-04-generative-ai-and-dotnet---part-2-sdk/
It’s time to have a look at how we can build the basics of an application using Azure OpenAI Services and the .NET SDK.
https://community.torproject.org/onion-services/setup/
This guide shows you how to set up an Onion Service for your website. For the technical details of how the Onion Service protocol works, see our Onion Service protocol page.
https://spectrum.ieee.org/doctorow-interoperability
In his new book The Internet Con: How to Seize the Means of Computation, author Cory Doctorow presents a strong case for disrupting Big Tech. While the dominance of Internet platforms like Twitter, Facebook, Instagram, or Amazon is often taken for granted, Doctorow argues that these walled gardens are fenced in by legal structures, not feats of engineering. Doctorow proposes forcing interoperability—any given platform’s ability to interact with another—as a way to break down those walls and to make the Internet freer and more democratic.
https://www.microsoft.com/en-us/research/blog/rethinking-trust-in-direct-messages-in-the-ai-era/
This blog post is a part of a series exploring our research in privacy, security, and cryptography. For the previous post, see https://www.microsoft.com/en-us/research/blog/research-trends-in-privacy-security-and-cryptography. While AI has the potential to massively increase productivity, this power can be used equally well for malicious purposes, for example, to automate the creation of sophisticated scam messages. In this post, we explore threats AI can pose for online communication ecosystems and outline a high-level approach to mitigating these threats.
https://perplexity.vercel.app/
I built this little tool to help me understand what it's like to be an autoregressive language model. For any given passage of text, it augments the original text with highlights and annotations that tell me how "surprising" each token is to the model, and which other tokens the model thought were most likely to occur in its place. Right now, the LM I'm using is the smallest version of GPT-2, with 124M parameters.
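A minimal sketch of the "how surprising is each token" idea. The tool uses GPT-2; to keep this runnable without any model, it uses a hypothetical hand-written table of next-token probabilities, so the numbers are illustrative only. Surprisal here is the negative base-2 log of the probability the model gave the token that actually appeared.

```python
import math

# Hypothetical next-token probabilities given the previous token.
# A real tool would get these from a language model such as GPT-2.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.25, "moon": 0.25},
    "cat": {"sat": 0.75, "flew": 0.25},
}


def surprisal_bits(prev: str, token: str) -> float:
    """Surprisal of `token` after `prev`, in bits: -log2 p(token | prev)."""
    return -math.log2(TOY_MODEL[prev][token])


def annotate(tokens: list[str]) -> list[tuple[str, float, str]]:
    """For each token, return (token, surprisal, the model's top choice)."""
    rows = []
    prev = "<s>"
    for tok in tokens:
        dist = TOY_MODEL[prev]
        top = max(dist, key=dist.get)
        rows.append((tok, surprisal_bits(prev, tok), top))
        prev = tok
    return rows


if __name__ == "__main__":
    for tok, bits, top in annotate(["the", "cat", "flew"]):
        print(f"{tok:>5}  {bits:.2f} bits  (most likely: {top})")
```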
https://www.fast.ai/posts/2023-09-04-learning-jumps/
Summary: recently while fine-tuning a large language model (LLM) on multiple-choice science exam questions, we observed some highly unusual training loss curves. In particular, it appeared the model was able to rapidly memorize examples from the dataset after seeing them just once. This astonishing feat contradicts most prior wisdom about neural network sample efficiency. Intrigued by this result, we conducted a series of experiments to validate and better understand this phenomenon. It’s early days, but the experiments support the hypothesis that the models are able to rapidly remember inputs. This might mean we have to re-think how we train and use LLMs.
https://openai.com/blog/teaching-with-ai
We’re releasing a guide for teachers using ChatGPT in their classroom—including suggested prompts, an explanation of how ChatGPT works and its limitations, the efficacy of AI detectors, and bias.
Solo by Pixelfed
We've been secretly building a single user federated photo sharing server (based on Pixelfed), with minimal setup, and built-in import
Solo is simple, download the code, drag your photos to the media directory and open your browser
So simple, yet super smart, Solo won't get in your way, but it will impress
Launching Oct 2023
Very excited about this!
https://www.aaron-powell.com/posts/2023-09-01-generative-ai-and-dotnet---part-1-intro/
... over this series I’m going to share my learnings on the APIs, SDKs, and the like. The goal here isn’t to “build something” but rather to share what I’ve learnt, the mistakes I’ve made, the things I’ve found confusing, and the code I’ve had to rewrite umpteen times because “oh, that’s a better way to do it”.
Llama 2 7B/13B are now available in Web LLM!
Llama 2 70B is also supported.
This project brings large-language model and LLM-based chatbot to web browsers. Everything runs inside the browser with no server support and accelerated with WebGPU. This opens up a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.
https://a16z.com/2023/08/30/supporting-the-open-source-ai-community/
We believe artificial intelligence has the power to save the world—and that a thriving open source ecosystem is essential to building this future.
To help close this resource gap, we’re announcing today the a16z Open Source AI Grant program. We’ll support a small group of open source developers through grant funding (not an investment or SAFE note), giving them the opportunity to continue their work without the pressure to generate financial returns.
https://www.reality2cast.com/151
Good discussion on the open web, self-hosting, radio, and the indie web.
https://simonwillison.net/2023/Aug/27/wordcamp-llms/
My goal today is to provide practical, actionable advice for getting the most out of Large Language Models—both for personal productivity but also as a platform that you can use to build things that you couldn’t build before.
https://www.newyorker.com/culture/cultural-comment/we-dont-need-a-new-twitter
If Meta can succeed in capturing some of this peak-Twitter magic, while avoiding late-stage Twitter’s struggles, the company will perhaps even reclaim some of the cultural gravity that it squandered a decade ago when Facebook took its turn toward crazy-uncle irrelevance. But can Meta possibly succeed in building a saner, nicer Twitter?
Breaking news can spread quickly, as can clips that are funny in an original or strange way—but these innocuous trends feel serendipitous, like a rainbow spanning storm clouds. To reach the Twitter masses, conspiracy, demagoguery, and cancellation are much more likely to succeed. The result is a Faustian bargain for our networked era: trusting the wisdom of crowds to identify what’s interesting can create an intensely compelling stream of shared content, but this content is likely to arrive drenched in rancor.
The obvious way Meta can attempt to escape this bargain is by moving Threads away from retransmission-based curation and toward algorithmic ranking. This will give the company more control over which discussions are amplified, but, in doing so, they will also lose the human-powered selectivity that makes Twitter so engaging.
If we look past this narrow discussion of Threads’ challenges, however, a broader question arises: Why is it so important to create a better version of Twitter in the first place? Ignored amid the hand-wringing about the toxic turn taken by large-scale conversation platforms are the many smaller, less flashy sites and services that have long been supporting a more civilized form of digital interaction.
“The Internet has become the ultimate narrowcasting vehicle: everyone from UFO buffs to New York Yankee fans has a Website (or dozen) to call his own,” the journalist Richard Zoglin wrote in 1996. “A dot-com in every pot.”
We’ve gone from Zoglin’s dot-com in every pot to the social-media age’s vision of every pot being filled with slop from the same platforms.
https://studio.ribbonfarm.com/p/against-waldenponding
Waldenponding (after Thoreau's Walden Pond experiment on which Walden is based). The crude caricature is "smash your smart phone and go live in a log cabin to reclaim your attention and your life from being hacked by evil social media platforms."
https://tomcritchlow.com/2022/04/21/new-rss/
RSS is kind of an invisible technology. People call RSS dead because you can’t see it. There’s no feed, no login, no analytics. RSS feels subsurface.
Come to think of it - all the interesting bits of blogging are invisible. The discussion has moved to Twitter, or discords, or DMs. Trackbacks aren’t a thing anymore. So when you see someone blogging all you see is the blog post. The branching replies and conversations are either invisible or hard to track down.
But I believe we’re living in a golden age of RSS. Blogging is booming. My feed reader has 280 feeds in it.
How do we increase the surface area of RSS and blogging?
I think there’s something quietly radical about making your feed reader open by default. It increases the surface area of RSS so others can discover content more easily. It makes blogging more visible.
The nice thing about RSS and OPML is that’s a very extensible spec. The file format is flexible, you can define your own schema and fields. This might open up new kinds of publishing.
About Feeds is a free site from Matt Webb. I made this site because using web feeds for the first time is hard, and we can fix that.
https://www.edge.org/conversation/marvin_minsky-consciousness-is-a-big-suitcase
Marvin Minsky is the leading light of AI—artificial intelligence, that is. He sees the brain as a myriad of structures. Scientists who, like Minsky, take the strong AI view believe that a computer model of the brain will be able to explain what we know of the brain's cognitive abilities. Minsky identifies consciousness with high-level, abstract thought, and believes that in principle machines can do everything a conscious human being can do.
A graphical Unix-like operating system for desktop computers!
SerenityOS is a love letter to '90s user interfaces with a custom Unix-like core. It flatters with sincerity by stealing beautiful ideas from various other systems.
Roughly speaking, the goal is a marriage between the aesthetic of late-1990s productivity software and the power-user accessibility of late-2000s *nix.
This is a system by us, for us, based on the things we like.
https://ai.meta.com/blog/code-llama-large-language-model-coding/
Takeaways
- Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts.
- Code Llama is free for research and commercial use.
- Code Llama is built on top of Llama 2 and is available in three models:
  - Code Llama, the foundational code model;
  - Code Llama - Python, specialized for Python;
  - and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions.
- In our own benchmark testing, Code Llama outperformed state-of-the-art publicly available LLMs on code tasks
Today we’re announcing a significant evolution in the analytical capabilities available within Excel by releasing a Public Preview of Python in Excel. Python in Excel makes it possible to natively combine Python and Excel analytics within the same workbook - with no setup required. With Python in Excel, you can type Python directly into a cell, the Python calculations run in the Microsoft Cloud, and your results are returned to the worksheet, including plots and visualizations.
https://www.deeplearning.ai/short-courses/large-language-models-semantic-search/
Keyword search has been a common method for search for many years. But for content-rich websites like news media sites or online shopping platforms, the keyword search capability can be limiting. Incorporating large language models (LLMs) into your search can significantly enhance the user experience by allowing them to ask questions and find information in a much easier way.
This course teaches the techniques needed to leverage LLMs into search.
Throughout the lessons, you’ll explore key concepts like dense retrieval, which elevates the relevance of retrieved information, leading to improved search results beyond traditional keyword search, and reranking, which injects the intelligence of LLMs into your search system, making it faster and more effective.
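A toy sketch of the two stages the course description names: dense retrieval by vector similarity, then reranking the shortlist with a second scorer. The three-dimensional vectors and the rerank scores are made up for illustration; a real system would get them from an embedding model and an LLM-based reranker.

```python
import math

# Hypothetical document embeddings; real ones come from an embedding model.
DOCS = {
    "returns policy": [0.9, 0.1, 0.0],
    "shipping times": [0.2, 0.9, 0.1],
    "gift cards": [0.1, 0.2, 0.9],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def dense_retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]


def rerank(candidates: list[str], score_fn) -> list[str]:
    """Reorder the shortlist using a second, more careful relevance score."""
    return sorted(candidates, key=score_fn, reverse=True)


if __name__ == "__main__":
    shortlist = dense_retrieve([0.3, 0.8, 0.0])
    # Hypothetical reranker scores standing in for an LLM's judgment.
    llm_scores = {"returns policy": 0.9, "shipping times": 0.4}
    print("retrieved:", shortlist)
    print("reranked: ", rerank(shortlist, llm_scores.get))
```

The split matters because the first stage is cheap enough to run over the whole corpus, while the second only has to score a handful of candidates.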
https://eugeneyan.com/writing/llm-patterns/
There are seven key patterns. They’re also organized along the spectrum of improving performance vs. reducing cost/risk, and closer to the data vs. closer to the user.
- Evals: To measure performance
- RAG: To add recent, external knowledge
- Fine-tuning: To get better at specific tasks
- Caching: To reduce latency & cost
- Guardrails: To ensure output quality
- Defensive UX: To anticipate & manage errors gracefully
- Collect user feedback: To build our data flywheel
Addendum: how to match these LLM patterns to potential problems
https://github.com/huggingface/candle
Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.
https://huyenchip.com/2023/08/16/llm-research-open-challenges.html
- Reduce and measure hallucinations
- Optimize context length and context construction
- Incorporate other data modalities
- Make LLMs faster and cheaper
- Design a new model architecture
- Develop GPU alternatives
- Make agents usable
- Improve learning from human preference
- Improve the efficiency of the chat interface
- Build LLMs for non-English languages
https://www.infoq.com/news/2023/08/jupyter-ai-notebooks/
The open-source Project Jupyter, used by millions for data science and machine learning, has released Jupyter AI, a free tool bringing powerful generative AI capabilities to Jupyter notebooks.
https://jupyter-ai.readthedocs.io/en/latest/
Jupyter AI, which brings generative AI to Jupyter. Jupyter AI provides a user-friendly and powerful way to explore generative AI models in notebooks and improve your productivity in JupyterLab and the Jupyter Notebook. More specifically, Jupyter AI offers:
- An %%ai magic that turns the Jupyter notebook into a reproducible generative AI playground. This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, VSCode, etc.).
- A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant.
- Support for a wide range of generative model providers and models (AI21, Anthropic, Cohere, Hugging Face, OpenAI, SageMaker, etc.).
Interesting project.
RetroStrange is part independent media experiment, part handmade exaltation of vintage weirdness and the public domain. It is produced by Noah Maher and Phil Nelson.
https://developer.wordpress.org/playground/
Not a WordPress user but this is cool, especially this - "Build an entire site, save it, host it"
WordPress Playground makes WordPress instantly accessible for users, learners, extenders, and contributors. You can:
- Try a block, a theme, or a plugin
- Build an entire site, save it, host it
- Test your plugin with many specific WordPress and PHP versions
- Embed a real, interactive WordPress site in your tutorial or course
- Showcase a plugin or theme on your website
- Preview pull requests from your repository
- …or even run WordPress locally using the Visual Studio Code plugin or a CLI tool called wp-now
a loose association of like-minded tilde communities.
tildes are pubnixes in the spirit of tilde.club, which was created in 2014 by paul ford.
Public Access UNIX Systems (PAUS) are a type of server that provide various services to a multi-user community. They first began in the early 1980's and continue today. Early servers ran various flavors of UNIX, hence the name Public Access "UNIX" Systems, but later generations saw a large mix of Unix-variants and, of course, GNU/Linux. To recognize the many different operating systems online today, these systems are increasingly referred to generically as "pubnixes". - Pubnix Hist
https://nick-black.com/htp-notcurses.pdf
A TUI (text user interface) is a holistic model, view, and controller implemented using character graphics. TUIs, like WIMP GUIs, freely move the cursor around their rectilinear display, as opposed to line-oriented CLIs and their ineluctable marches through the scrolling region. Given the same interactive task:
• A TUI implementation is almost certainly a smaller memory and disk footprint than a GUI,
• a good TUI implementation might introduce less latency, and
• a properly-done TUI implementation can often be significantly more portable.
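To make the "freely move the cursor" point concrete, here is a minimal sketch that positions text with ANSI/VT100 escape sequences. This is an illustration only, not the Notcurses API from the paper: the helper names (`move_to`, `draw_at`, `render_box`) are hypothetical, and it assumes a terminal that understands the `CSI row;col H` cursor-position sequence.

```python
import sys

CSI = "\x1b["  # Control Sequence Introducer understood by ANSI/VT100-style terminals


def move_to(row: int, col: int) -> str:
    """Return the escape sequence that places the cursor at a 1-based row and column."""
    return f"{CSI}{row};{col}H"


def draw_at(row: int, col: int, text: str) -> str:
    """Return a string that prints text starting at (row, col) when written to a terminal."""
    return move_to(row, col) + text


def render_box(top: int, left: int, width: int, height: int) -> str:
    """Build a box from character graphics that can be drawn anywhere on the display."""
    horizontal = "+" + "-" * (width - 2) + "+"
    middle = "|" + " " * (width - 2) + "|"
    rows = [horizontal] + [middle] * (height - 2) + [horizontal]
    return "".join(draw_at(top + i, left, line) for i, line in enumerate(rows))


if __name__ == "__main__":
    sys.stdout.write(f"{CSI}2J")  # clear the screen
    sys.stdout.write(render_box(top=3, left=10, width=20, height=5))
    sys.stdout.write(draw_at(5, 12, "hello from a TUI"))
    sys.stdout.write(move_to(9, 1))  # park the cursor below the box
    sys.stdout.flush()
```

A line-oriented CLI would only ever append to the scrolling region. Here each write names its own coordinates, which is what lets a TUI redraw one part of the screen without touching the rest.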
https://the-dam.org/docs/explanations/suc.html
suc provides Slack, Mattermost, etc.’s core features:
- Real-time, rich-text chat,
- File sharing,
- Fine-grained access control,
- Straightforward automation and integration with other tools,
- Data encryption in transit and optionally at rest,
- State-of-the-art user authentication.
This paper shows how suc implements those features. suc stays small by leveraging the consistent and composable primitives offered by modern UNIX implementations.
https://creatoreconomy.so/p/kaz-coo-shopify-craft-and-no-meetings
The difference between crafters and managers
The difference is in what you spend time on.
"Most people get satisfaction from building — from actually creating things."
As companies scale, optics start playing a larger role. People start spending more time on internal docs than actually talking to customers. How do you prevent this from happening at Shopify?
In most product reviews, product managers spend way too much time preparing the perfect presentation for execs.
At Shopify, our approach to product reviews is different. We want to see how the product actually works by playing with the demo or diving into the code.
We want our PMs to be extremely user-focused, to take full ownership over problems, and to have a high tolerance for risk.
If these attributes aren't present, product managers tend to become “keepers of strategy.” You end up with smart, highly credentialed individuals spending all their time writing strategy memos to increase their team size so that they can write even more strategy memos.
How Shopify rages against meetings
In early 2023, Shopify initiated operation “Chaos Monkey” to:
- Cancel all meetings with 3+ people
- Reinstate “no meeting Wednesdays”
- Remove needless Slack channels
Does Shopify have a strong writing culture to help people communicate without meetings?
Yes, we try to make async decisions as much as we can. We do this in a few ways:
One of our mantras is “Do things, tell people.” You’ll see this plastered on our walls if you come to Shopify’s office.
We built an operating system called GSD (get shit done). This internal tool emphasizes frequent written updates, which are much easier to digest than constant meetings.
A meeting is a bug that some other process didn’t work out.
We focus on the mission. We want to be the all-in-one commerce platform for people to start and grow businesses. We try to avoid getting distracted by other side quests.
The main thing is to keep the main thing the main thing.
Today we announce the formation of xAI.
The goal of xAI is to understand the true nature of the universe.
https://keras.io/keras_core/announcement/
We're excited to share with you a new library called Keras Core, a preview version of the future of Keras. In Fall 2023, this library will become Keras 3.0. Keras Core is a full rewrite of the Keras codebase that rebases it on top of a modular backend architecture. It makes it possible to run Keras workflows on top of arbitrary frameworks — starting with TensorFlow, JAX, and PyTorch.
Keras Core is also a drop-in replacement for tf.keras, with near-full backwards compatibility with tf.keras code when using the TensorFlow backend. In the vast majority of cases you can just start importing it via import keras_core as keras in place of from tensorflow import keras and your existing code will run with no issue — and generally with slightly improved performance, thanks to XLA compilation.
https://openai.com/blog/gpt-4-api-general-availability
GPT-4 API general availability and deprecation of older models in the Completions API. GPT-3.5 Turbo, DALL·E and Whisper APIs are also generally available, and we are releasing a deprecation plan for older models of the Completions API, which will retire at the beginning of 2024.
The 512KB Club is a collection of performance-focused web pages from across the Internet. To qualify your website must satisfy both of the following requirements:
- It must be an actual site that contains a reasonable amount of information, not just a couple of links on a page (more info here).
- Your total UNCOMPRESSED web resources must not exceed 512KB.
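The size requirement is simple arithmetic, so a quick check can be sketched as below. The resource names and byte counts are made up for illustration, and treating 512KB as 512 × 1024 bytes is an assumption; the club's own tooling may measure differently.

```python
LIMIT_BYTES = 512 * 1024  # assumption: 1 KB is counted as 1024 bytes


def total_uncompressed_size(resources: dict[str, int]) -> int:
    """Sum the uncompressed size, in bytes, of every resource the page loads."""
    return sum(resources.values())


def qualifies(resources: dict[str, int], limit: int = LIMIT_BYTES) -> bool:
    """Return True when the page's total uncompressed weight is within the limit."""
    return total_uncompressed_size(resources) <= limit


# Hypothetical page: file names and sizes are illustrative only.
page = {
    "index.html": 14_200,
    "style.css": 6_300,
    "logo.svg": 2_100,
    "photo.jpg": 310_000,
}

if __name__ == "__main__":
    print(total_uncompressed_size(page), qualifies(page))
```

With these numbers the page totals 332,600 bytes and qualifies. Adding a 300,000-byte script would push it over the limit.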
1MB Club is a growing collection of performance-focused web pages weighing less than 1 megabyte.
https://proton.me/blog/proton-pass-launch
We’re happy to announce the global launch of Proton Pass...a password manager, one of the most highly demanded services from the Proton community in our annual surveys since we first launched Proton Mail...
I'm really enjoying the sessions. Kudos to the team who put it together and presenters delivering great content.
If you're interested, check out the stream at http://fsharpconf.com/.
List of resources from the Reclaim Open 2023 Conference.
https://arxiv.org/pdf/2306.02707.pdf
Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimating the small model’s capability as they tend to learn to imitate the style, but not the reasoning process of LFMs. To address these challenges, we develop Orca, a 13-billion parameter model that learns to imitate the reasoning process of LFMs. Orca learns from rich signals from GPT-4 including explanation traces; step-by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT. To promote this progressive learning, we tap into large-scale and diverse imitation data with judicious sampling and selection. Orca surpasses conventional state-of-the-art instruction-tuned models such as Vicuna-13B by more than 100% in complex zero-shot reasoning benchmarks like Big-Bench Hard (BBH) and 42% on AGIEval. Moreover, Orca reaches parity with ChatGPT on the BBH benchmark and shows competitive performance (4 pts gap with optimized system message) in professional and academic examinations like the SAT, LSAT, GRE, and GMAT, both in zero-shot settings without CoT; while trailing behind GPT-4. Our research indicates that learning from step-by-step explanations, whether these are generated by humans or more advanced AI models, is a promising direction to improve model capabilities and skills.
https://gorilla.cs.berkeley.edu/
Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis. However, their potential to effectively use tools via API calls remains unfulfilled. This is a challenging task even for today's state-of-the-art LLMs such as GPT-4, largely due to their inability to generate accurate input arguments and their tendency to hallucinate the wrong usage of an API call. We release Gorilla, a finetuned LLaMA-based model that surpasses the performance of GPT-4 on writing API calls. When combined with a document retriever, Gorilla demonstrates a strong capability to adapt to test-time document changes, enabling flexible user updates or version changes. It also substantially mitigates the issue of hallucination, commonly encountered when prompting LLMs directly. To evaluate the model's ability, we introduce APIBench, a comprehensive dataset consisting of HuggingFace, TorchHub, and TensorHub APIs. The successful integration of the retrieval system with Gorilla demonstrates the potential for LLMs to use tools more accurately, keep up with frequently updated documentation, and consequently increase the reliability and applicability of their outputs.
https://arxiv.org/abs/2305.10973
...we propose DragGAN, which consists of two main components: 1) a feature-based motion supervision that drives the handle point to move towards the target position, and 2) a new point tracking approach that leverages the discriminative generator features to keep localizing the position of the handle points. Through DragGAN, anyone can deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc. As these manipulations are performed on the learned generative image manifold of a GAN, they tend to produce realistic outputs even for challenging scenarios such as hallucinating occluded content and deforming shapes that consistently follow the object's rigidity. Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking. We also showcase the manipulation of real images through GAN inversion.
https://huggingface.co/blog/starcoder
StarCoder and StarCoderBase are Large Language Models for Code (Code LLMs) trained on permissively licensed data from GitHub, including from 80+ programming languages, Git commits, GitHub issues, and Jupyter notebooks. Similar to LLaMA, we trained a ~15B parameter model for 1 trillion tokens. We fine-tuned StarCoderBase model for 35B Python tokens, resulting in a new model that we call StarCoder.
https://github.com/microsoft/LoRA
LoRA reduces the number of trainable parameters by learning pairs of rank-decomposition matrices while freezing the original weights. This vastly reduces the storage requirement for large language models adapted to specific tasks and enables efficient task-switching during deployment all without introducing inference latency. LoRA also outperforms several other adaptation methods including adapter, prefix-tuning, and fine-tuning.
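A rough sketch of the idea in plain Python, not the microsoft/LoRA package API: a frozen weight matrix W (d × k) gets a trainable low-rank update B·A, with B of shape d × r and A of shape r × k. The function names and toy matrices are hypothetical, and the scaling factor that real implementations apply to the update is left out for brevity.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Add two matrices of the same shape element by element."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def trainable_params(d: int, k: int, r: int) -> dict:
    """Compare trainable parameter counts: full fine-tuning versus a rank-r update."""
    return {"full": d * k, "lora": r * (d + k)}


# Toy example: W stays frozen, only B and A would be trained.
W = [[1, 2], [3, 4]]  # frozen pretrained weight, 2 x 2
B = [[1], [2]]        # trainable, 2 x 1 (rank r = 1)
A = [[3, 4]]          # trainable, 1 x 2
x = [[1], [1]]        # input as a column vector

# During training: keep the two paths separate.
separate = add(matmul(W, x), matmul(B, matmul(A, x)))

# At deployment: fold the update into W once, so inference is a single matmul.
W_merged = add(W, matmul(B, A))
merged = matmul(W_merged, x)

if __name__ == "__main__":
    print(separate, merged)
    print(trainable_params(d=4096, k=4096, r=8))
```

Both paths give the same output, which is why merging adds no inference latency. For a 4096 × 4096 layer at rank 8, the update has 65,536 trainable parameters against 16,777,216 for the full matrix.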
https://arxiv.org/abs/2305.02463
We present Shap-E, a conditional generative model for 3D assets. Unlike recent work on 3D generative models which produce a single output representation, Shap-E directly generates the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields. We train Shap-E in two stages: first, we train an encoder that deterministically maps 3D assets into the parameters of an implicit function; second, we train a conditional diffusion model on outputs of the encoder. When trained on a large dataset of paired 3D and text data, our resulting models are capable of generating complex and diverse 3D assets in a matter of seconds. When compared to Point-E, an explicit generative model over point clouds, Shap-E converges faster and reaches comparable or better sample quality despite modeling a higher-dimensional, multi-representation output space.
https://wordpress.com/blog/2023/06/01/newsletters-paid-subscriptions/
...we’re introducing a big update — the ability to add paid subscriptions and premium content, whatever plan you’re on. Including the Free plan.
Paid subscriptions let your fans support your art, writing, or project directly.
https://ai.facebook.com/blog/multilingual-model-speech-recognition/
The Massively Multilingual Speech (MMS) project expands speech technology from about 100 languages to over 1,000 by building a single multilingual speech recognition model supporting over 1,100 languages (more than 10 times as many as before), language identification models able to identify over 4,000 languages (40 times more than before), pretrained models supporting over 1,400 languages, and text-to-speech models for over 1,100 languages. Our goal is to make it easier for people to access information and to use devices in their preferred language.
https://ai.google/discover/palm2
PaLM 2 is our next generation large language model that builds on Google’s legacy of breakthrough research in machine learning and responsible AI.
It excels at advanced reasoning tasks, including code and math, classification and question answering, translation and multilingual proficiency, and natural language generation better than our previous state-of-the-art LLMs, including PaLM. It can accomplish these tasks because of the way it was built – bringing together compute-optimal scaling, an improved dataset mixture, and model architecture improvements.
PaLM 2 is grounded in Google’s approach to building and deploying AI responsibly. It was evaluated rigorously for its potential harms and biases, capabilities and downstream uses in research and in-product applications. It’s being used in other state-of-the-art models, like Med-PaLM 2 and Sec-PaLM, and is powering generative AI features and tools at Google, like Bard and the PaLM API.
https://huggingface.co/docs/transformers/en/transformers_agents
Transformers Agent...provides a natural language API on top of transformers: we define a set of curated tools and design an agent to interpret natural language and to use these tools. It is extensible by design; we curated some relevant tools, but we’ll show you how the system can be extended easily to use any tool developed by the community.
https://githubnext.com/projects/copilot-for-docs
Whether you’re learning a new library or API or you’ve been using it for years, it can feel like the documentation gets in your way more than it helps. Maybe the tutorials are too basic, or the reference manual is too sketchy, or the relevant information is split across multiple pages full of irrelevant details.
We’re exploring a way to get you the information you need, faster. By surfacing the most relevant content for questions with tailored summaries that help connect the dots, Copilot for docs saves developers from scouring reams of documentation.
https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/
In ChatGPT Prompt Engineering for Developers, you will learn how to use a large language model (LLM) to quickly build new and powerful applications. Using the OpenAI API, you’ll be able to quickly build capabilities that learn to innovate and create value in ways that were cost-prohibitive, highly technical, or simply impossible before now.
https://ai.facebook.com/blog/imagebind-six-modalities-binding-ai/
...ImageBind, the first AI model capable of binding information from six modalities. The model learns a single embedding, or shared representation space, not just for text, image/video, and audio, but also for sensors that record depth (3D), thermal (infrared radiation), and inertial measurement units (IMU), which calculate motion and position.
https://openaipublic.blob.core.windows.net/neuron-explainer/paper/index.html
This paper applies automation to the problem of scaling an interpretability technique to all the neurons in a large language model. Our hope is that building on this approach of automating interpretability will enable us to comprehensively audit the safety of models before deployment.
Our technique seeks to explain what patterns in text cause a neuron to activate. It consists of three steps:
- Explain the neuron's activations using GPT-4
- Simulate activations using GPT-4, conditioning on the explanation
- Score the explanation by comparing the simulated and real activations
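The third step can be sketched as follows, assuming the comparison is a Pearson correlation between simulated and real activations. The GPT-4 calls are replaced by a toy keyword-based simulator, and every name and data value here is hypothetical.

```python
from math import sqrt
from typing import Callable, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def score_explanation(
    tokens: Sequence[str],
    real_activations: Sequence[float],
    explanation: str,
    simulate: Callable[[str, Sequence[str]], list],
) -> float:
    """Simulate activations from the explanation, then compare them with the real ones."""
    simulated = simulate(explanation, tokens)
    return pearson(simulated, real_activations)


def keyword_simulator(explanation: str, tokens: Sequence[str]) -> list:
    """Stand-in for the model-based simulator: predict 1.0 for tokens named in the explanation."""
    keywords = set(explanation.lower().split())
    return [1.0 if token.lower() in keywords else 0.0 for token in tokens]


# Illustrative data: a neuron that responds to animal words.
tokens = ["the", "cat", "sat", "on", "the", "dog"]
real = [0.0, 0.9, 0.1, 0.0, 0.0, 0.8]

good = score_explanation(tokens, real, "animal words: cat dog", keyword_simulator)
bad = score_explanation(tokens, real, "articles: the", keyword_simulator)

if __name__ == "__main__":
    print(round(good, 3), round(bad, 3))
```

An explanation that predicts the neuron's behavior well scores close to 1, while a misleading one scores near zero or below.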
https://huggingface.co/blog/peft
...as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!
PEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g., image classification and stable diffusion dreambooth.
https://blog.jim-nielsen.com/2023/offline-is-online-with-extreme-latency/
you can think of online/offline as part of the same continuum just different measurements of latency. There are gradations of latency when you’re “online”, and “offline” is merely at the slowest end of that spectrum.
LLaVA represents a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking spirits of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.
Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks in the language domain, but the idea is less explored in the multimodal field.
1. Multimodal Instruct Data. We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data.
2. LLaVA Model. We introduce LLaVA (Large Language-and-Vision Assistant), an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding.
3. Performance. Our early experiments show that LLaVA demonstrates impressive multimodal chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields an 85.1% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%.
4. Open-source. We make GPT-4 generated visual instruction tuning data, our model and code base publicly available.
https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API
The WebGPU API enables web developers to use the underlying system's GPU (Graphics Processing Unit) to carry out high-performance computations and draw complex images that can be rendered in the browser.
https://github.com/openai/consistency_models
Paper: https://arxiv.org/abs/2303.01469
Diffusion models have made significant breakthroughs in image, audio, and video generation, but they depend on an iterative generation process that causes slow sampling speed and caps their potential for real-time applications. To overcome this limitation, we propose consistency models, a new family of generative models that achieve high sample quality without adversarial training. They support fast one-step generation by design, while still allowing for few-step sampling to trade compute for sample quality. They also support zero-shot data editing, like image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either as a way to distill pre-trained diffusion models, or as standalone generative models. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step generation. For example, we achieve the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained as standalone generative models, consistency models also outperform single-step, non-adversarial generative models on standard benchmarks like CIFAR-10, ImageNet 64x64 and LSUN 256x256.
https://arxiv.org/pdf/2304.03442.pdf
In this paper, we introduce generative agents—computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent’s experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors...
By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
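The store-and-retrieve part of that architecture might look roughly like the sketch below. The scoring rule (keyword overlap plus an exponential recency bonus) is a simplifying assumption for illustration, not the paper's formula, and the reflection step is not covered. All names and sample memories are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    time: int  # simulation step at which the experience was recorded
    text: str  # natural-language description of the experience


@dataclass
class MemoryStream:
    records: list = field(default_factory=list)

    def store(self, time: int, text: str) -> None:
        """Append an experience to the agent's complete record."""
        self.records.append(Memory(time, text))

    def retrieve(self, query: str, now: int, k: int = 2, decay: float = 0.9) -> list:
        """Return the k memories scoring highest on keyword overlap plus recency."""
        query_words = set(query.lower().split())

        def score(memory: Memory) -> float:
            overlap = len(query_words & set(memory.text.lower().split()))
            recency = decay ** (now - memory.time)
            return overlap + recency

        return sorted(self.records, key=score, reverse=True)[:k]


stream = MemoryStream()
stream.store(1, "ate breakfast with Maria")
stream.store(2, "painted a landscape in the studio")
stream.store(3, "talked with Maria about the election")

relevant = stream.retrieve("Maria election", now=4)

if __name__ == "__main__":
    for memory in relevant:
        print(memory.time, memory.text)
```

The retrieved memories would then be placed into the language model's prompt when the agent plans its next action.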
https://bair.berkeley.edu/blog/2023/04/03/koala/
...Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.
Many of the most capable LLMs require huge computational resources to train, and oftentimes use large and proprietary datasets. This suggests that in the future, highly capable LLMs will be largely controlled by a small number of organizations, and both users and researchers will pay to interact with these models without direct access to modify and improve them on their own. On the other hand, recent months have also seen the release of increasingly capable freely available or (partially) open-source models, such as LLaMA. These systems typically fall short of the most capable closed models, but their capabilities have been rapidly improving. This presents the community with an important question: will the future see increasingly more consolidation around a handful of closed-source models, or the growth of open models with smaller architectures that approach the performance of their larger but closed-source cousins?
Our results suggest that learning from high-quality datasets can mitigate some of the shortcomings of smaller models, maybe even matching the capabilities of large closed-source models in the future.
https://www.fast.ai/posts/part2-2023.html
From Deep Learning Foundations to Stable Diffusion...is part 2 of Practical Deep Learning for Coders.
In this course, containing over 30 hours of video content, we implement the astounding Stable Diffusion algorithm from scratch!
https://learn.microsoft.com/semantic-kernel/howto/schillacelaws
The "Schillace Laws" were formulated after working with a variety of Large Language Model (LLM) AI systems to date. Knowing them will accelerate your journey into this exciting space of reimagining the future of software engineering.
https://blog.darklang.com/gpt/
On February 1, we stopped working on what we're now calling "darklang-classic", and are fully heads down on building "darklang-gpt", which is the same core Darklang but redesigned to have AI as the primary (or possibly only) way of writing code.
https://laion.ai/blog/open-flamingo/
...OpenFlamingo, an open-source reproduction of DeepMind's Flamingo model. At its core, OpenFlamingo is a framework that enables training and evaluation of large multimodal models (LMMs).
https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/
Cerebras open sources seven GPT-3 models from 111 million to 13 billion parameters. Trained using the Chinchilla formula, these models set new benchmarks for accuracy and compute efficiency.
Today’s release is designed to be used by and reproducible by anyone. All models, weights, and checkpoints are available on Hugging Face and GitHub under the Apache 2.0 license. Additionally, we provide detailed information on our training methods and performance results in our forthcoming paper. The Cerebras CS-2 systems used for training are also available on-demand via Cerebras Model Studio.
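The post does not spell out the "Chinchilla formula". A common reading is roughly 20 training tokens per parameter, which is the assumption in this sketch, along with the widely used approximation that training compute is about 6 × parameters × tokens. The function names are hypothetical, and only the smallest and largest model sizes named above are used.

```python
TOKENS_PER_PARAM = 20  # assumption: the commonly cited Chinchilla rule of thumb


def compute_optimal_tokens(params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Estimate the compute-optimal number of training tokens for a model size."""
    return params * tokens_per_param


def approx_training_flops(params: int, tokens: int) -> int:
    """Approximate training compute with the common C = 6 * N * D estimate."""
    return 6 * params * tokens


if __name__ == "__main__":
    for params in (111_000_000, 13_000_000_000):
        tokens = compute_optimal_tokens(params)
        flops = approx_training_flops(params, tokens)
        print(f"{params:,} params -> {tokens:,} tokens, ~{flops:.2e} FLOPs")
```

Under these assumptions the 111 million parameter model would train on about 2.2 billion tokens and the 13 billion parameter model on about 260 billion.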
https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/
This post only focuses on prompt engineering for autoregressive language models, so nothing with Cloze tests, image generation or multimodality models. At its core, the goal of prompt engineering is about alignment and model steerability.
https://openai.com/product/gpt-4
GPT-4 can accept images as inputs and generate captions, classifications, and analyses.
GPT-4 is capable of handling over 25,000 words of text, allowing for use cases like long form content creation, extended conversations, and document search and analysis.
https://crfm.stanford.edu/2023/03/13/alpaca.html
We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model. We train the Alpaca model on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003. Alpaca shows many behaviors similar to OpenAI’s text-davinci-003, but is also surprisingly small and easy/cheap to reproduce.
https://danielmiessler.com/blog/spqa-ai-architecture-replace-existing-software/
AI-based applications will be completely different than those we have today. The new architecture will be a far more elegant, four-component structure based around GPTs: State, Policy, Questions, and Action.
A security program using SPQA
Choose the base model — You start with the latest and greatest overall GPT model from OpenAI, Google, Meta, McKinsey, or whoever. Lots of companies will have one. Let’s call it OpenAI’s GPT-6. It already knows so incredibly much about security, biotech, project management, scheduling, meetings, budgets, incident response, and audit preparedness that you might be able to survive with it alone. But you need more personalized context.
Train your custom model — Then you train your custom model which is based on your own data, which will stack on top of GPT-6. This is all the stuff in the STATE section above. It’s your company’s telemetry and context. Logs. Docs. Finances. Chats. Emails. Meeting transcripts. Everything. It’s a small company and there are compression algorithms as part of the Custom Model Generation (CMG) product we use, so it’s a total of 312TB of data. You train your custom model on that.
Train your policy model — Now you train another model that’s all about your company’s desires. The mission, the goals, your anti-goals, your challenges, your strategies. This is the guidance that comes from humans that we’re using to steer the ACTION part of the architecture. When we ask it to make stuff for us, and build out our plans, it’ll do so using the guardrails captured here in the POLICY.
Tell the system to take the following actions — Now the models are combined. We have GPT-6, stacked with our STATE model, also stacked with our POLICY model, and together they know us better than we know ourselves.
https://simonwillison.net/2023/Mar/11/llama/
The race is on to release the first fully open language model that gives people ChatGPT-like capabilities on their own devices.
https://bitwarden.com/blog/access-your-bitwarden-vault-without-a-password/
Bitwarden expands the Log in with device option that lets you use a second device to authenticate your Bitwarden vault login instead of using your Bitwarden password.
Data-Centric AI (DCAI) is an emerging science that studies techniques to improve datasets, which is often the best way to improve performance in practical ML applications. While good data scientists have long practiced this manually via ad hoc trial/error and intuition, DCAI considers the improvement of data as a systematic engineering discipline.
This is the first-ever course on DCAI. This class covers algorithms to find and fix common issues in ML data and to construct better datasets, concentrating on data used in supervised learning tasks like classification. All material taught in this course is highly practical, focused on impactful aspects of real-world ML applications, rather than mathematical details of how particular models work. You can take this course to learn practical techniques not covered in most ML classes, which will help mitigate the “garbage in, garbage out” problem that plagues many real-world ML applications.
https://karpathy.medium.com/software-2-0-a64152b37c35
The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc...Software 2.0 is written in much more abstract, human unfriendly language, such as the weights of a neural network.
Benefits of Software 2.0
- Computationally homogeneous
- Simple to bake into silicon
- Constant running time
- Constant memory use
- Highly portable
- Agile
- Modules can meld into an optimal whole
- It is better than you
Limitations of Software 2.0
At the end of the optimization we’re left with large networks that work well, but it’s very hard to tell how. Across many applications areas, we’ll be left with a choice of using a 90% accurate model we understand, or 99% accurate model we don’t.
The 2.0 stack can fail in unintuitive and embarrassing ways, or worse, they can “silently fail”.
...the existence of adversarial examples and attacks highlights the unintuitive nature of this stack.
Programming Software 2.0
If you recognize Software 2.0 as a new and emerging programming paradigm instead of simply treating neural networks as a pretty good classifier in the class of machine learning techniques, the extrapolations become more obvious, and it’s clear that there is much more work to do.
In particular, we’ve built up a vast amount of tooling that assists humans in writing 1.0 code, such as powerful IDEs with features like syntax highlighting, debuggers, profilers, go to def, git integration, etc. In the 2.0 stack, the programming is done by accumulating, massaging and cleaning datasets. Who is going to develop the first Software 2.0 IDEs, which help with all of the workflows in accumulating, visualizing, cleaning, labeling, and sourcing datasets?
GitHub is a very successful home for Software 1.0 code. Is there space for a Software 2.0 GitHub? In this case repositories are datasets and commits are made up of additions and edits of the labels.
Traditional package managers and related serving infrastructure like pip, conda, docker, etc. help us more easily deploy and compose binaries. How do we effectively deploy, share, import and work with Software 2.0 binaries? What is the conda equivalent for neural networks?
https://huggingface.co/blog/blip-2
This guide introduces BLIP-2 from Salesforce Research that enables a suite of state-of-the-art visual-language models that are now available in 🤗 Transformers. We'll show you how to use it for image captioning, prompted image captioning, visual question-answering, and chat-based prompting.
BLIP-2 bridges the modality gap between vision and language models by adding a lightweight Querying Transformer (Q-Former) between an off-the-shelf frozen pre-trained image encoder and a frozen large language model. Q-Former is the only trainable part of BLIP-2; both the image encoder and language model remain frozen.
https://www.schneier.com/blog/archives/2023/02/attacking-machine-learning-systems.html
At their core, modern ML systems have complex mathematical models that use training data to become competent at a task. And while there are new risks inherent in the ML model, all of that complexity still runs in software. Training data are still stored in memory somewhere. And all of that is on a computer, on a network, and attached to the Internet. Like everything else, these systems will be hacked through vulnerabilities in those more conventional parts of the system.
https://github.com/microsoft/LMOps
LMOps is a research initiative on fundamental research and technology for building AI products w/ foundation models, especially on the general technology for enabling AI capabilities w/ LLMs and Generative AI models.
- Better Prompts: Promptist, Extensible prompts
- Longer Context: Structured prompting, Length-Extrapolatable Transformers
- Knowledge Augmentation (TBA)
- Fundamentals
A few updates I found interesting.
The Beauty of languages: You can now use your own voice during a translated call in Skype.
Universal translator: During a Skype call, if a participant speaks different languages, Skype Translator will automatically detect the languages and translate it for you.
Extra! Extra! Read all about it: Stay up-to-date with the latest news and trends.
Hit me up sometime: You can easily add contacts in Skype on mobile using a unique QR code to get connected.
I have a few thoughts on the Today feature and overall updates Skype has been receiving over the past few months but I'll leave those for another post.
Testing WebmentionFs send integration. From Website.
As someone who spends a significant amount of time in the browser, making Edge (or more specifically, the browser) the platform for applications looks appealing. Especially when you consider Progressive Web App (PWA) support and integrations with the Windows Store.
Automattic, the current owner of tumblr, one of the biggest meme mines in the world, has said they’re going to implement ActivityPub, which is the underlying protocol on which Mastodon operates. There are boring ways this could go, and then interesting ways this could go.
Interesting points. I'm looking forward to how this shakes out.
https://www.coursera.org/specializations/mathematics-for-machine-learning-and-data-science
Master the Toolkit of AI and Machine Learning. Mathematics for Machine Learning and Data Science is a beginner-friendly specialization where you’ll learn the fundamental mathematics toolkit of machine learning: calculus, linear algebra, statistics, and probability.
https://blog.jim-nielsen.com/2023/best-time-to-own-a-domain/
if you own your domain, create value there, and drive people to it, you’re paying ~$10 a year to build unbounded value over the years, value you control.
That is why owning a domain (and publishing your content there) is like planting a tree: it’s value that starts small and grows. The best time to own a domain and publish your content there was 20 years ago. The second best time is today.
I love a technology like rel=me which pushes the idea of domain ownership into broader and broader spheres of society: “How do I get that nice little green checkmark on my profile?” It reinforces the value of owning your own domain (which you can verify elsewhere) and encourages citizens of the internet to build value in their own little corners of the world wide web.
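As a note to self on how rel=me checks work in practice: a verifier fetches your page and looks for a link back to the profile that carries rel="me". Below is a minimal sketch of that link-back check using only Python's standard library. The page markup and the social.example profile URL are made up for illustration, and a real verifier would also fetch the page over HTTPS and handle redirects.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values from <a> and <link> tags whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.links.append(href)


def rel_me_links(html):
    """Return every rel=me link target found in the given HTML."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


def links_back(page_html, profile_url):
    """True if the page declares profile_url as a rel=me link."""
    normalized = profile_url.rstrip("/")
    return any(link.rstrip("/") == normalized for link in rel_me_links(page_html))


# Hypothetical home page and profile URL, for illustration only.
home_page = '<a rel="me" href="https://social.example/@alice">Mastodon</a>'
print(links_back(home_page, "https://social.example/@alice"))
```

The comparison only ignores a trailing slash; anything stricter or looser (scheme, case, redirects) is a policy choice left out of this sketch.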
https://bitwarden.com/blog/bitwarden-open-source-security-explained/
This article answers three common questions about how being open source strengthens Bitwarden security, transparency, and privacy.
http://simonwillison.net/2023/Jan/13/semantic-search-answers/#atom-everything
Here's how to do this:
- Run a text search (or a semantic search, described later) against your documentation to find content that looks like it could be relevant to the user's question.
- Grab extracts of that content and glue them all together into a blob of text.
- Construct a prompt consisting of that text followed by "Given the above content, answer the following question: " and the user's question
- Send the whole thing through the GPT-3 API and see what comes back
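The first three steps above can be sketched roughly as follows. This is a toy version under stated assumptions: the "search" is a plain word-overlap ranking rather than a real full-text or semantic search, the function names and size limits are hypothetical, and the final step of sending the prompt to a completion API is left out because it needs credentials and a client library.

```python
def find_relevant(docs, question, limit=3):
    """Rank documents by how many question words they share (toy text search)."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    words.discard("")
    scored = []
    for doc in docs:
        doc_words = {w.strip("?.,!").lower() for w in doc.split()}
        overlap = len(words & doc_words)
        if overlap:
            scored.append((overlap, doc))
    # Stable sort: ties keep their original document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


def build_prompt(docs, question, max_chars=2000):
    """Glue extracts of the relevant documents together and append the question."""
    extracts = [doc[:500] for doc in find_relevant(docs, question)]
    context = "\n\n".join(extracts)[:max_chars]
    return (
        f"{context}\n\n"
        f"Given the above content, answer the following question: {question}"
    )


# Hypothetical documentation snippets, for illustration only.
docs = [
    "The tool supports CSV export of any table.",
    "Plugins are installed with pip.",
    "The weather is nice.",
]
print(build_prompt(docs, "How do I export CSV?"))
```

The character limits stand in for the model's context limit; a real version would count tokens instead.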
https://blog.jim-nielsen.com/2023/subscribe-wherever-you-get-your-content/
Wouldn’t it be amazing if a similar phrase could enter the larger public consciousness for blogs or even people who you follow online?
Instead of:
“Follow me on Twitter.” “I’m @realChuckieCheese on Instagram.” “Subscribe to my videos on YouTube.”
You’d hear some popular influencer saying:
“Follow me wherever you follow people online.” “Find me and subscribe wherever you get your content.”
In a world like this, perhaps apex domains could become the currency of online handles: follow me @jim-nielsen.com.
Now the only challenges for me are:
- Should the song on my website match my ringtone?
- Where do I place the flames GIF?
If you're looking for new folks to follow, here's a list of blogs on a variety of topics.
https://www.windowscentral.com/hardware/computers-desktops/intel-core-i3-n-series-launch-ces-2023
The article suggests these processors are for the education market. Hopefully a few of the devices with these processors in a Surface Go form-factor are made generally available as well. I think the N200 series is the sweet spot in terms of performance and battery life with only 6W of max turbo power. The i3 which draws a max of 15W may be too much. At that point, some of the newly announced U-series processors might make more sense.
Great post.
For me, I've found POSSE to Twitter and Mastodon using RSS and IFTTT-like services has been a relatively low-effort thing to do. The downside though is any follow-up conversations take place on those platforms. That hasn't been an issue for me yet though so until it is, I'll continue to use my relatively low-effort solution.
I agree with you on likes / reposts. My current implementation of those posts is focused on Webmentions, which means it's virtually meaningless since few people know about, let alone integrate, Webmentions into their website. I would argue replies tend to fall into that same space as well. I have, however, found great use in bookmark posts, to which I try to add some content from the source document to help me quickly identify why I thought that post / website was interesting or relevant to a specific topic at that time.
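For reference, the RSS-based POSSE setup mentioned above boils down to reading the feed and turning each new item into a short status. Here is a rough sketch of that step, assuming a plain RSS 2.0 feed; the sample feed, the function name, and the 280-character limit are illustrative assumptions, and actually posting to Twitter or Mastodon is left to whatever service does the syndication.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example blog</title>
  <item><title>First post</title><link>https://example.com/first</link></item>
  <item><title>Second post</title><link>https://example.com/second</link></item>
</channel></rss>"""


def feed_to_statuses(feed_xml, already_posted=(), max_len=280):
    """Turn RSS items into short status texts, skipping links already syndicated."""
    root = ET.fromstring(feed_xml)
    statuses = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if not link or link in already_posted:
            continue
        # Leave room for the link plus one separating space.
        room = max(max_len - len(link) - 1, 0)
        statuses.append(f"{title[:room]} {link}".strip())
    return statuses


for status in feed_to_statuses(SAMPLE_FEED, already_posted={"https://example.com/first"}):
    print(status)
```

Every status links back to the original post, so the site stays the canonical copy even though replies still land on the other platform.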
https://github.com/karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs. It's a re-write of minGPT, which I think became too complicated, and which I am hesitant to now touch. Still under active development, currently working to reproduce GPT-2 on OpenWebText dataset. The code itself aims by design to be plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://www.theverge.com/23513418/bring-back-personal-blogging
In the beginning, there were blogs, and they were the original social web. We built community. We found our people. We wrote personally. We wrote frequently. We self-policed, and we linked to each other so that newbies could discover new and good blogs.
The biggest reason personal blogs need to make a comeback is a simple one: we should all be in control of our own platforms.
People built entire communities around their favorite blogs, and it was a good thing. You could find your people, build your tribe, and discuss the things your collective found important.
Buy that domain name. Carve your space out on the web. Tell your stories, build your community, and talk to your people. It doesn’t have to be big. It doesn’t have to be fancy. You don’t have to reinvent the wheel. It doesn’t need to duplicate any space that already exists on the web — in fact, it shouldn’t. This is your creation. It’s your expression. It should reflect you.
https://openai.com/blog/new-and-improved-embedding-model/
The new model, text-embedding-ada-002, replaces five separate models for text search, text similarity, and code search, and outperforms our previous most capable model, Davinci, at most tasks, while being priced 99.8% lower.
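For context on what "text search" and "text similarity" mean with embeddings: each text is mapped to a vector, and texts are compared by the angle between their vectors. The sketch below shows that comparison with cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have far more dimensions), and no embedding API is called here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document keys sorted from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (assumption for illustration).
documents = {
    "rss-post": [0.9, 0.1, 0.0],
    "rust-course": [0.0, 1.0, 0.0],
}
print(rank_by_similarity([1.0, 0.0, 0.0], documents))
```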
https://danielmiessler.com/blog/its-time-to-get-back-into-rss/
We all mourned when Reader died and took RSS with it, but it's time to return to what made it great
Really cool project. Don't really care for the crypto stuff but the rest looks very interesting.
https://matrix.org/blog/2022/12/25/the-matrix-holiday-update-2022
These updates are exciting!
Reddit appears to be building out new Chat functionality using Matrix
Discourse is working on adding Matrix support
Thunderbird launched Matrix support.
Automattic is busy building Matrix plugins for Wordpress
Not as exciting
...only a handful of these initiatives have resulted in funding reaching the core Matrix team. This is directly putting core Matrix development at risk.
In short: folks love the amazing decentralised encrypted comms utopia of Matrix. But organisations also love that they can use it without having to pay anyone to develop or maintain it. This is completely unsustainable, and Element is now literally unable to fund the entirety of the Matrix Foundation on behalf of everyone else - and has had to lay off some of the folks working on the core team as a result.
In the interim, if you are an organisation who’s building on Matrix and you want the project to continue to flourish, please mail funding@matrix.org to discuss how you can support the foundations that you are depending on.
I'm looking forward to the future of the Matrix protocol, especially the P2P components.
https://google.github.io/comprehensive-rust/
This is a four day Rust course developed by the Android team. The course covers the full spectrum of Rust, from basic syntax to advanced topics like generics and error handling. It also includes Android-specific content on the last day.
https://github.com/norvig/pytudes/blob/main/ipynb/AlphaCode.ipynb
Large language models have recently shown an ability to solve a variety of problems. In this notebook we consider programming problems (as solved by AlphaCode) and mathematics problems (as solved by Minerva). The questions we would like to get at are:
- In the future, what role will these generative models play in assisting a programmer or mathematician?
- What will be a workflow that incorporates these models?
- How will other existing tools (such as programming languages) change to accommodate this workflow?
Worked well for me. Thanks for sharing!
Today we're joined by ChatGPT, the latest and coolest large language model developed by OpenAI. In our conversation with ChatGPT, we discuss the background and capabilities of large language models, the potential applications of these models, and some of the technical challenges and open questions in the field. We also explore the role of supervised learning in creating ChatGPT, and the use of PPO in training the model. Finally, we discuss the risks of misuse of large language models, and the best resources for learning more about these models and their applications.
https://github.com/charmbracelet/glow
Render markdown on the CLI
https://github.blog/2022-12-07-github-copilot-is-generally-available-for-businesses/
GitHub Copilot for Business gives organizations:
- The power of AI. Millions of developers have already used GitHub Copilot to build software faster, stay in the flow longer, and solve problems in new ways—all right from their editor of choice.
- Simple license management. Administrators can enable GitHub Copilot for their teams and select which organizations, teams, and developers receive licenses.
- Organization-wide policy management. You can easily set policy controls to enforce user settings for public code matching on behalf of your organization.
- Your code is safe with us. With Copilot for Business, we won’t retain code snippets, store or share your code regardless if the data is from public repositories, private repositories, non-GitHub repositories, or local files.
A container-based approach to boot a full Android system on a regular GNU/Linux system like Ubuntu.
https://dev.to/cognipla/geospatial-is-a-function-of-your-life-1924
Florence, a low-code geospatial library for everyone that ranks city places with (life-inspired) functions.
Florence heavily leverages the .NET Interactive/Polyglot Notebooks runtime to hide and execute code behind the scenes (including breaking some language constraints).
https://www.infoq.com/news/2022/12/microsoft-farmvibes/
Microsoft Research recently open-sourced FarmVibes.AI, a suite of ML models and tools for sustainable agriculture. FarmVibes.AI includes data processing workflows for fusing multiple sets of spatiotemporal and geospatial data, such as weather data and satellite and drone imagery.
https://medium.com/emacs/using-elfeed-to-view-videos-6dfc798e51e6
Slightly modified the original script to use Streamlink and lower the quality to 240p for bandwidth and resource purposes.
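(require 'elfeed)
(require 'cl-lib)

(defun elfeed-v-mpv (url)
  "Watch the video at URL in mpv via Streamlink at 240p."
  ;; Quote the URL so characters like & are not interpreted by the shell.
  (async-shell-command
   (format "streamlink -p mpv %s 240p" (shell-quote-argument url))))

(defun elfeed-view-mpv (&optional use-generic-p)
  "Play the selected elfeed entries in mpv and mark them as read."
  (interactive "P")
  (let ((entries (elfeed-search-selected)))
    (cl-loop for entry in entries
             do (elfeed-untag entry 'unread)
             ;; `it' is bound by cl-loop to the value of the `when' test.
             when (elfeed-entry-link entry)
             do (elfeed-v-mpv it))
    (mapc #'elfeed-search-update-entry entries)
    (unless (use-region-p) (forward-line))))

;; Press "v" in the elfeed search buffer to play the selected entries.
(define-key elfeed-search-mode-map (kbd "v") 'elfeed-view-mpv)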
https://openai.com/blog/chatgpt/
We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.
https://opensource.com/article/22/11/open-source-payphone-philtel
PhilTel is a telephone collective based in Philadelphia, Pennsylvania, focusing on making communications accessible to everyone by installing free-to-use payphones. While you'll be able to make standard telephone calls through our phones, we're also focusing on offering interesting services or experiences. We don't want to only facilitate human-to-human interaction but also human-to-machine interaction and give people an environment where they can explore the telephone network and learn from it.
https://simonwillison.net/2022/Nov/26/productivity/
Consider timezones: engineers in Madrid and engineers in San Francisco had almost no overlap in their working hours. Good asynchronous communication was essential.
Over time, I noticed that the teams that were most effective at this scale were the teams that had a strong culture of documentation and automated testing.
As I started to work on my own array of smaller personal projects, I found that the same discipline that worked for large teams somehow sped me up, when intuitively I would have expected it to slow me down.
https://stability.ai/blog/stable-diffusion-v2-release
The Stable Diffusion 2.0 release includes robust text-to-image models trained using a brand new text encoder (OpenCLIP), developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to earlier V1 releases.
Stable Diffusion 2.0 also includes an Upscaler Diffusion model that enhances the resolution of images by a factor of 4.
https://www.tremendous.com/blog/the-perks-of-a-high-documentation-low-meeting-work-culture
the intangible, overarching benefit of practicing meeting mindfulness is this: you spend less of your day sort-of-listening and more of your day really thinking.
Automattic CEO Matt Mullenweg — whose company acquired Tumblr from Verizon in 2019 — suggested...Tumblr... would soon add ...activitypub.
I'm perfectly happy using my site as the main place to post content but considering it's under new management and adopting open standards, it might be time to check out Tumblr again.
In the six weeks since announcing that Internet Archive has begun gathering content for the Digital Library of Amateur Radio and Communications (DLARC), the project has quickly grown to more than 25,000 items, including ham radio newsletters, podcasts, videos, books, and catalogs.
https://derw.substack.com/p/telepathic-technical-writing
...blog posts are always async, but they can lead to conversations and debates, once the reader is done reading. There's also the nature of blog posts being one-to-many, whereas chat is many-to-many, if done in a public channel. One-to-many forms of communication should generally be more formal, but can spread ideas and thoughts more coherently than many-to-many.
https://blog.archive.org/2022/11/15/digital-books-wear-out-faster-than-physical-books/
Our paper books have lasted hundreds of years on our shelves and are still readable. Without active maintenance, we will be lucky if our digital books last a decade.
Feta is a Matrix server distribution for the Raspberry Pi 3 and 4.
This looks like a great project to get started with self-hosting and join the Matrix network. Having something similar for the fediverse would be great as well. I've hosted Matrix and Mastodon servers on a Raspberry Pi before so I know it's up to the task.
https://newsletter.danhon.com/archive/4230/
Here’s how it works in practice:
- A news organization (or any organization, let’s just start with news) already asserts ownership of its domain e.g. via its certificate, so we piggyback trust off its domain.
- It stands up a Mastodon or other social server at a standard address. I’d propose follow.washingtonpost.com but there’s a bunch of reasons why you might do something else, see below, and uses the agreed well-known autodiscovery protocol to return the address for its Mastodon server (but I don’t see an entry for activitypub or Mastodon yet).
- It creates accounts for its staff on its Mastodon server. Nobody else gets an account; the public can’t sign up.
What you get:
- Verified accounts. Instead of relying on a third party to “verify” your account by sticking a blue check against your display name, account provenance and “verification” is inherited as a property of the Mastodon server....
- Ease of discovery...all a user would have to do, to find Washington Post accounts to follow, would be to know the washingtonpost.com domain. Autodiscovery would let your Mastodon client point itself to the appropriate server.
Not just news organizations...anyone can set up a Mastodon server...the federation means that “official” accounts become “more official” when their server home is hung off the domain of the actual organization.
You wouldn’t need Twitter (or anyone else, really) to verify that the UK Prime Minister’s account is official, because you’d have following.gov.uk as the Mastodon server, which means you can trust that server as much as you trust .gov.uk domains.
Your university or college wants you to have a social media account? Sure, you can have it hosted at following.ucla.edu.
And yes, brands can get in on it. Sure. That way there’s a tiny chance you’re following the Proper Brand Account rather than a Parody Brand Account, which… is probably for the best. Or it’s easier to see that a Parody Account is a Parody Account because you can look at the parent server.
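The core of the idea above is that trust comes from where the server lives: an account counts as "official" if its server is the organization's domain or a subdomain of it. A minimal sketch of that check follows. The function names and the example handles are hypothetical, and the autodiscovery step the post describes is not modeled here.

```python
def parse_handle(handle):
    """Split '@user@server' (or 'user@server') into (user, server)."""
    user, sep, server = handle.lstrip("@").partition("@")
    if not sep or not user or not server:
        raise ValueError(f"not a full fediverse handle: {handle!r}")
    return user, server.lower()


def hosted_under(handle, org_domain):
    """True if the handle's server is org_domain or a subdomain of it."""
    _, server = parse_handle(handle)
    org_domain = org_domain.lower().lstrip(".")
    # Require a dot boundary so 'notgov.uk' does not pass for 'gov.uk'.
    return server == org_domain or server.endswith("." + org_domain)


# Hypothetical handles, for illustration only.
print(hosted_under("@reporter@follow.washingtonpost.com", "washingtonpost.com"))
print(hosted_under("@reporter@washingtonpost.com.evil.example", "washingtonpost.com"))
```

The dot-boundary check is what keeps look-alike domains from passing, which is the same reason a parody account is easy to spot by looking at its parent server.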
https://www.karlsutt.com/articles/communicating-effectively-as-a-developer/
Communicating effectively as an engineer means empathically increasing the resolution of your writing.
...“low-resolution writing”...There is very little context, too much reliance on pronouns and unclear references. The writing is not empathic; the reader has to spend extra energy to work out what is being said
Longer-form writing gives you an opportunity to dive deeper into why you are saying what you are saying. It is a chance to educate, to teach, to help understand and to level up.
The quality of the API documentation will carry an astronomical amount of leverage. This leverage will work in both directions. Genuinely helpful documentation is the difference between being swamped by support requests from frustrated API users and significantly increasing the usage of your service. Happy users beget more happy users.
Spoken words get forgotten. Written words are shared, preserved, and become the basis of a company's culture.
High resolution, empathic writing...You will have to spend more energy to make your writing easy to follow. You will have to grapple with your own confusion and holes in your understanding. You will have to figure out what the appropriate density for your writing is.
It's not about you, though. It's about them.
Not only does a single recipient benefit from your extra effort, what if ten people read your good work? A hundred? A thousand? What if the company CEO reads it? Taking writing seriously at work or in your organisation and putting in the effort to delight the reader will, over time, compound into a massive body of quality writing that benefits everyone. It is a literal win-win-win
Produce writing you would read with delight if you were on the other end.
It's also built with .NET 🙂
https://matthiasott.com/notes/suspension
write your most important thoughts on your own site. You can share the link on as many platforms as you like and have conversations with anyone who wants to connect with you and your work. But nobody can take it from you. You are in control. Forever.
https://pytorch.org/blog/PyTorchfoundation/
PyTorch is moving to the Linux Foundation (LF) as a top-level project under the name PyTorch Foundation.
The PyTorch Technical Governance now supports a hierarchical maintainer structure and clear outlining of processes around day to day work and escalations. This doesn’t change how we run things, but it does add discipline and openness that at our scale feels essential and timely.
https://blog.pocketcasts.com/2022/10/19/pocket-casts-mobile-apps-are-now-open-source/
We’ve been eager to take this step since we joined Automattic last year...
Wow. Yet another company I didn't know was owned by Automattic (parent company of WordPress). So not only do they have a stake in regular blogs and websites, but they also are in the microblogging space with Tumblr. With Pocket Casts they're now into podcasts too. Good for them.
We believe that podcasting can not and should not be controlled by Apple and Spotify, and instead support a diverse ecosystem of third-party clients.
I couldn't agree more. Though to be fair to Apple, in all the years since podcasts have been a thing, despite them being one of the main indices, they didn't make any overt attempts to lock down the ecosystem. It's not until companies like Amazon and Spotify tried to make certain content platform exclusives that the ecosystem has started to feel more closed.
https://www.fast.ai/posts/part2-2022-preview.html
In total, we’re releasing four videos, with around 5.5 hours of content, covering the following topics (the lesson numbers start at “9”, since this is a continuation of Practical Deep Learning for Coders part 1, which had 8 lessons):
- Lesson 9 by Jeremy Howard: How to use Diffusers pipelines; What are the conceptual parts of Stable Diffusion
- Lesson 9A by Jonathan Whitaker: A deep dive into Stable Diffusion concepts and code
- Lesson 9B by Wasim Lorgat and Tanishq Abraham: The math of diffusion
- Lesson 10 by Jeremy Howard: Creating a custom diffusion pipeline; Starting “from the foundations”
https://twitter.com/andrestaltz/status/1582448952057958401
My experience working on SSB (i.e. non-crypto fully decentralized protocol) is that it is suitable for small world communication, and definitely not suitable for big world. Email proves that federation can do big world.
https://maggieappleton.com/garden-history
The conversational feed design of email inboxes, group chats, and InstaTwitBook is fleeting – they're only concerned with self-assertive immediate thoughts that rush by us in a few moments...But streams only surface the Zeitgeisty ideas of the last 24 hours...Gardens present information in a richly linked landscape that grows slowly over time...The garden helps us move away from time-bound streams and into contextual knowledge spaces.
The Six Patterns of Gardening:
- Topography over Timelines - Gardens are organised around contextual relationships and associative links; the concepts and themes within each note determine how it's connected to others.
- Continuous Growth - Gardens are never finished, they're constantly growing, evolving, and changing.
- Imperfection & Learning in Public - Gardens are imperfect by design. They don't hide their rough edges or claim to be a permanent source of truth.
- Playful, Personal, and Experimental - Gardens are non-homogenous by nature. You can plant the same seeds as your neighbour, but you'll always end up with a different arrangement of plants.
- Intercropping & Content Diversity - Gardens are not just a collection of interlinked words...Podcasts, videos, diagrams, illustrations, interactive web animations, academic papers, tweets, rough sketches, and code snippets should all live and grow in the garden.
- Independent Ownership - Gardening is about claiming a small patch of the web for yourself, one you fully own and control... If you give it a bit of forethought, you can build your garden in a way that makes it easy to transfer and adapt. Platforms and technologies will inevitably change. Using old-school, reliable, and widely used web native formats like HTML/CSS is a safe bet.
https://blueskyweb.org/blog/10-18-2022-the-at-protocol
Bluesky was created to build a social protocol. In the spring, we released “ADX,” the very first iteration of the protocol...ADX is now the “Authenticated Transport Protocol”...more simply, the “AT Protocol.”
The “AT Protocol” is a new federated social network
What makes AT Protocol unique:
- Account portability
- Algorithmic choice
- Interoperation
- Performance
https://herbertlui.net/overdue-insights-on-daily-blogging/
Here are a bunch of not-so-obvious lessons I’ve internalized through writing each day:
- Writing can be a starting point, not an ending one
- Write to think
- The power of DIFY: Do it for yourself, don’t think too much about what you want other people to get out of it.
- Think small
- Gain energy
- A lot of books are collections of blog posts
- Small is an unblocker
https://gist.github.com/luisquintanilla/164176ec414e465246d6323aa62b38df
This sample shows how to use a pretrained Bidirectional Attention Flow (BiDAF) ONNX model in ML.NET for answering a query about a given context paragraph.
https://almanac.httparchive.org/en/2022/
The Web Almanac is a comprehensive report on the state of the web, backed by real data and trusted web experts. The 2022 edition is comprised of 23 chapters spanning aspects of page content, user experience, publishing, and distribution.
Readium LCP was developed five years ago to protect digital files from unauthorized distribution. Unlike proprietary platforms, the technology is open to anyone who wants to look inside the codebase and make improvements. It is a promising alternative for libraries and users wanting to avoid the limitations of traditional DRM.
https://berk.es/2022/10/12/blog-comments-on-a-static-site-via-social-networks/
discu.eu, a fantastic service that you provide with an URL and then gives back results from various social networks. Places where that URL is discussed. Hackernews, Reddit, and/or Lobsters. It has an API, that I can call with some JavaScript and then insert in the page.
https://zenodo.org/record/7070952#.Y0m5eX7MK00
This dataset contains detailed data on 42,207 apartments (242,257 rooms) in 3,093 buildings including their geometries, room typology as well as their visual, acoustical, topological and daylight characteristics.
https://arxiv.org/abs/2210.03945
Large language models (LLMs) have shown exceptional performance on a variety of natural language tasks. Yet, their capabilities for HTML understanding...have not been fully explored. We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analysis of their capabilities under three tasks: (i) Semantic Classification of HTML elements, (ii) Description Generation for HTML inputs, and (iii) Autonomous Web Navigation of HTML pages.
Out of the LLMs we evaluate, we show evidence that T5-based models are ideal due to their bidirectional encoder-decoder architecture.
https://www.kaggle.com/kaggle-survey-2022
Development
Python and SQL remain the two most common programming skills for data scientists
VSCode is now used by over 50% of working data scientists
Notebooks are a popular environment as well.
Colab notebooks are the most popular cloud-based Jupyter notebook environment
Makes sense especially since Kaggle is owned by Google.
Machine Learning
Scikit-learn is the most popular ML framework while PyTorch has been growing steadily year-over-year
LightGBM, XGBoost are also among the top frameworks.
Transformer architectures are becoming more popular for deep learning models (both image and text data)
Cloud computing
All major cloud computing providers saw strong year over year growth in 2022
Specialized hardware like Tensor Processing Units (TPUs) is gaining initial traction with Kaggle data scientists
Resources
I got serious about consolidating the media my family consumes; I decided to buy blu-ray and DVD copies of all the movies and TV shows we actually cared about, and rip them onto a NAS.
I'm glad I've gone down this path, and become more intentional about my media consumption, especially as companies are deleting already-purchased content from users' media libraries! It's sickening (and, IMHO, should be illegal) they can show a 'buy' button for DRM'ed digital downloads that the user never actually 'owns'.
Couldn't agree more. Although the focus of this post is on video, you can say the same for music and books.
http://scripting.com/2022/10/01/133834.html?title=docQuixote
I spent great time, energy and money, over many years to create the writing and programming environment I wanted to use and I wanted my peers to use, so we could work together to create species-saving communication tools, and just beauty...
...I read the story of David Bowie's last days, he did something amazing when he knew he had a short time to live. He stepped back and got out of the way. He understood this is no longer his world.
When you're young, you think expansively, and as you get old reality sinks in and your imagination contracts. The horizon gets closer and closer. We don't get to mold the world, we are not gods, no matter how good or generous, smart or ruthless you may be, we all start out young and if we're lucky we get old and then we're gone.
https://www.stateof.ai/2022-report-launch.html
Key takeaways:
- New independent research labs are rapidly open sourcing the closed source output of major labs.
- Safety is gaining awareness among major AI research entities
- The China-US AI research gap has continued to widen
- AI-driven scientific research continues to lead to breakthroughs
https://stevens.netmeister.org/631/
In this course, students will learn to develop complex system-level software in the C programming language while gaining an intimate understanding of the Unix operating system (and all OS that belong to this family, such as Linux, the BSDs, and even Mac OS X) and its programming environment.
Topics covered will include the user/kernel interface, fundamental concepts of Unix, user authentication, basic and advanced I/O, filesystems, signals, process relationships, and interprocess communication. Fundamental concepts of software development and maintenance on Unix systems (development and debugging tools such as "make" and "gdb") will also be covered.
https://arxiv.org/pdf/2210.00108.pdf
...backdoors can be added during compilation, circumventing any safeguards in the data preparation and model training stages.
some backdoors, such as ImpNet, can only be reliably detected at the stage where they are inserted and removing them anywhere else presents a significant challenge.
machine-learning model security requires assurance of provenance along the entire technical pipeline, including the data, model architecture, compiler, and hardware specification.
A bookmark (or linkblog) is a post that is primarily comprised of a URL, often title text from that URL, sometimes optional text describing, tagging, or quoting from its contents.
Bookmarks are useful for saving things to read later or build a recommended reading list.
Both this post and Tom MacWright's "How to Blog" resonate. I've been posting more consistently to my microblog feeds. These posts are more informal, but simply producing something, even a short snippet, is gratifying. The informality makes the posts significantly shorter, but I publish content and share ideas at a faster pace.
https://blog.jim-nielsen.com/2019/i-love-rss/
It's always fun to stumble upon these lists and find more interesting people and websites to follow.
https://archive.org/details/locations-venues-indie-web-camp-berlin-2022
Interesting discussion at about 21:30 on a federated wiki / review aggregator.
https://blog.jim-nielsen.com/2022/other-peoples-websites/
Google+ was Google trying to mimic the walled garden of Facebook — their “how” of extracting value from the people of the internet. But they already had an answer for Facebook’s News Feed in front of them: Blogger / Google Reader, the read / write of the internet.
They provide the tools – Reader, Blogger, Search — we provide the personal websites. The open, accessible, indexable web as The Next Great Social Network.
I've been doing something similar to readlists, except with RSS feeds. I create a custom recipe in Calibre which pulls and aggregates the 50 latest articles for each feed in my recipe. I limit it to only look at articles published in the past day since I do this every evening. Think of it like the evening paper. The result is an EPUB file which I then import into my e-book reader.
A few advantages I've found to doing this are:
- I take a break from the computer.
- Since the e-book reader is offline, I focus on reading the article and don't have the option to click on links and get distracted browsing the internet.
- Since I'm already using the e-book reader, it's easy to transition to reading a book.
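The "evening paper" filter described above can be sketched in plain Python. This is not the Calibre recipe itself (in a Calibre recipe the same limits are normally set with the `oldest_article` and `max_articles_per_feed` options); it is a minimal standard-library illustration. The function name `recent_items`, the sample feed, and the example.com URLs are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET

# Hypothetical two-item RSS feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>Fresh post</title><link>https://example.com/fresh</link>
<pubDate>Sat, 01 Oct 2022 09:00:00 +0000</pubDate></item>
<item><title>Old post</title><link>https://example.com/old</link>
<pubDate>Mon, 26 Sep 2022 09:00:00 +0000</pubDate></item>
</channel></rss>"""

# A fixed "this evening" so the example is reproducible.
NOW = datetime(2022, 10, 1, 18, 0, tzinfo=timezone.utc)


def recent_items(rss_xml, now, max_age=timedelta(days=1), limit=50):
    """Return (title, link) pairs for items published within max_age of now."""
    root = ET.fromstring(rss_xml)
    picked = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # skip items without a publication date
        published = parsedate_to_datetime(pub_date)
        # Keep only items from the past day (and not from the future).
        if timedelta(0) <= now - published <= max_age:
            picked.append((item.findtext("title"), item.findtext("link")))
        if len(picked) == limit:
            break  # cap the number of articles taken from this feed
    return picked
```

Calling `recent_items(SAMPLE_FEED, NOW)` keeps only the item published that morning and drops the one from the previous week.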
https://stellarium.org/release/2022/10/01/stellarium-1.0.html
Can't believe this is the first time I've heard of this desktop app. Mobile apps like Sky Guide are convenient when on the go, but Stellarium not only seems to have lots of information, it's also cross-platform and web-based.
This ☝️. If you know, you know. 🔥
https://fosdem.org/2023/news/2022-09-14-fosdem-2023-dates/
It feels like the 2022 conference was just yesterday. In any case, save the date: February 4-5, 2023.
https://opensource.com/article/22/9/joplin-interview
Good read. Other than Emacs, Joplin is my go-to notetaking application.
https://web.dev/testing-web-design-color-contrast/
W3C’s Web Accessibility Initiative provides strategies, standards, and resources to ensure that the internet is accessible for as many people as possible. The guidelines that underpin these standards are called the Web Content Accessibility Guidelines, or WCAG.
Color contrast is an important piece of the puzzle for accessibility on the web, and adhering to it makes the web more usable for the greatest number of people in the most varied situations.
Apps to test contrast:
- Pika
- VisBug
- Chrome Dev Tools
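For a sense of what these tools check, here is a rough sketch of the WCAG 2.x contrast-ratio calculation: convert each sRGB channel to linear light, compute relative luminance, then take the ratio of the lighter to the darker color. The function names are my own; the constants come from the WCAG definition of relative luminance. WCAG level AA asks for at least 4.5:1 for normal-size text.

```python
def _linear(channel):
    """Convert one 8-bit sRGB channel (0-255) to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple, per the WCAG definition."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background):
    """True if the pair meets the AA threshold for normal-size text (4.5:1)."""
    return contrast_ratio(foreground, background) >= 4.5
```

For example, `passes_aa((118, 118, 118), (255, 255, 255))` checks the gray `#767676` against a white background.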
https://ai.googleblog.com/2022/09/tensorstore-for-high-performance.html
High-perf, scalable array storage that can be used for scenarios like language models.
https://opensource.com/article/22/9/git-configuration-linux
- Create global configuration
- Set default name
- Set default email address
- Set default branch name
- Set default editor
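As a rough sketch of what those steps amount to, the helper below builds the corresponding `git config --global` commands without running them. The keys `user.name`, `user.email`, `init.defaultBranch`, and `core.editor` are standard Git settings; the function name and the example values are placeholders of mine, not taken from the article. The first `--global` write is what creates the global configuration file if it does not exist yet.

```python
import shlex


def git_global_config_commands(name, email, branch="main", editor="nano"):
    """Return shell-ready `git config --global` commands for the basics."""
    settings = {
        "user.name": name,             # default author name
        "user.email": email,           # default author email address
        "init.defaultBranch": branch,  # branch name used by `git init`
        "core.editor": editor,         # editor opened for commit messages
    }
    return [
        shlex.join(["git", "config", "--global", key, value])
        for key, value in settings.items()
    ]


if __name__ == "__main__":
    # Placeholder identity; print the commands rather than executing them.
    for command in git_global_config_commands("Ada Lovelace", "ada@example.com"):
        print(command)
```

Printing the commands instead of executing them keeps the sketch side-effect free; paste them into a shell, then check the result with `git config --global --list`.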
https://www.robinsloan.com/notes/home-cooked-app/
When you liberate programming from the requirement to be general and professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen.
Same goes for websites and self-hosting.
https://xibbon.com/terminal/2022/09/21/welcome-to-la-terminal.html
I'm not an Apple user, but this is cool.
https://larahogan.me/blog/management-resource-library/
Resources on:
- Influence & managing up: Enact positive change for yourself, your team, or the whole organization.
- Leading through crises: Strengthen your support network, meet your team where they’re at, and weather the tough times.
- Cross-functional relationships: Strengthen relationships by creating role clarity and creatively supporting one another.
- One-on-ones: Set your teammates up for success during your one-on-one meetings!
- Hiring: Build consistent, repeatable, and equitable interviews and onboarding plans.
- Meetings: Support participants, hone the content, and nail the meeting goal.
- Feedback & performance reviews: Everyone deserves clear, actionable feedback!
- Communication & team dynamics: Plan ahead, facilitate well, and create clarity.
- Adapting your approach: As your work context and team evolves, your leadership approach will need to evolve, too.
https://weaviate.io/blog/2022/09/Distance-Metrics-in-Vector-Search.html
Metrics covered:
- Cosine
- Dot Product
- L2-Squared
- Manhattan
- Hamming
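To make the list concrete, here is a minimal pure-Python sketch of the five metrics using their textbook definitions. The function names are mine, and a vector database may define its variants slightly differently (for instance, by negating the dot product so that smaller always means closer), so treat this as an illustration rather than any particular engine's implementation.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus cosine similarity; depends on the angle, not the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1 - dot_product(a, b) / (norm_a * norm_b)


def l2_squared(a, b):
    """Squared Euclidean distance (skips the square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions at which the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))
```

All five functions assume two equal-length sequences of numbers, and `cosine_distance` additionally assumes neither vector is all zeros.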
https://en.wikipedia.org/wiki/1%25_rule
...about 1% of Internet users create content, while 99% are just consumers of that content
http://paulgraham.com/users.html
What have I learned from YC's users, the startups we've funded?
...most startups have the same problems.
...the batch that broke YC was a powerful demonstration of how individualized the process of advising startups has to be.
...founders can be [bad] at realizing what their problems are. Founders will sometimes come in to talk about some problem, and we'll discover another much bigger one in the course of the conversation.
Often founders know what their problems are, but not their relative importance.
Focus is doubly important for early stage startups, because not only do they have a hundred different problems, they don't have anyone to work on them except the founders. If the founders focus on things that don't matter, there's no one focusing on the things that do.
Speed defines startups. Focus enables speed. YC improves focus.
Why are founders uncertain about what to do? Partly because startups almost by definition are doing something new, which means no one knows how to do it yet, or in most cases even what "it" is.
disgruntled Facebook users keep using the service because they don’t want to leave behind their friends, family, communities and customers.
“How to Ditch Facebook Without Losing Your Friends” explains the rationale behind these proposals - and offers a tour of what it would be like to use a federated, interoperable Facebook, from setting up your account to protecting your privacy and taking control of your own community’s moderation policies, overriding the limits and permissions that Facebook has unilaterally imposed on its users.
This tool will help you to create a Firefox profile with the defaults you like.
You select which features you want to enable and disable and in the end you get a download link for a zip-file with your profile template.
20 years and still going strong.
https://www.zylstra.org/blog/2022/09/wordpressindieweb-as-the-os-of-the-open-social-web/
...2022 Netherlands WordCamp edition in Arnhem [presentation] on turning all WordPress sites into fully IndieWeb enabled sites. Meaning turning well over a third of the web into the open social web. Outside all the silos.
http://www.paulgraham.com/chameleon.html
When Lisp adopts a new paradigm it not only replicates existing practice, but goes beyond it to become a testbed for advancing the state of the art. Why has Lisp been able to adapt so easily when other languages have not? One reason is that Lisp is a programmable programming language.
https://www.freecodecamp.org/news/new-online-courses/
Subjects:
- Science
- Programming
- Education & Teaching
- Health & Medicine
- Social Sciences
- Business
- Information Security (InfoSec)
- Mathematics
- Computer Science
- Data Science
- Personal Development
- Humanities
- Engineering
- Art & Design
https://www.openbuildinginstitute.org/
The OBI system is open source, collaborative and distributed.
Our focus is on low cost and rapidly-built structures that are modular, ecological, and energy efficient.
https://ideaspace.substack.com/p/metablogging
...internally public blogs written by members of the...squad detailing what they’re working on and thinking about.
https://aeturrell.github.io/coding-for-economists/intro.html
...a guide for economists on what programming is, why it’s useful, and how to do it.
https://github.com/google/haskell-trainings
This repository contains the source for the slides and the exercises used in the Haskell trainings at Google.
https://www.opensourcealternative.to/
Discover 350+ popular open source alternatives to your proprietary SaaS.
Open Library is an initiative of the Internet Archive, a 501(c)(3) non-profit, building a digital library of Internet sites and other cultural artifacts in digital form.
Find feeds for all of your favorite sites and keep up with everything they post!
https://evernote.com/blog/how-to-create-commonplace-with-evernote/
Why commonplace?
Commonplace is an ideal medium for the curation and cultivation of intellectual ideas, thoughts, and knowledge. It’s also a proven and timeless system, tested by folks from a spectrum of backgrounds including authors, professors, and scientists.
- Remember things
- Write to recall
- Understand reading
- Personal reference system
- Filter ideas
- Unleash creativity
Reflecting on your commonplace
The most profound power of your commonplace is being able to thumb through reading and review material. Your commonplace book isn’t just a filing cabinet—it’s an evolving record of your life and observations.
https://www.jayeless.net/wiki/small-web.html
...the kind of web they define themselves against; that kind of bloated, corporate, algorithm-ruled and ad-ridden mess that constitutes the majority of highly-trafficked websites these days.
...for me, the term “Small Web” refers to a couple of main things: independence from tech giants, and websites that are lightweight and high-performance.
Related concepts:
- A late 90s-style, hand-crafted web
- Alternative protocols, like Gemini
- An independent web
This site is dedicated to a community (and a larger movement) about the internet and how it's changed. We are creating, discovering and enjoying websites and digital spaces.
https://knightcolumbia.org/events/reimagine-the-internet
A virtual conference exploring what the internet could become over the next decade
Sessions:
- Pioneering Alternative Models for Community on the Internet
- Misinformation, Disinformation, and Media Literacy in a Less-Centralized Social Media Universe
- Interoperability and Alternative Social Media
- Lessons from Experiments in Local Community-Building
- Deplatforming and Innovation
- New Directions in Social Media Research
...masterWiki is the direct adaptation of MasterClass' video courses translated into wikiHow-style how-to guides...
Here’s how to read the post:
- Level 1 — Casual. Read the headlines — figure out the details yourself. Most of this isn’t rocket science.
- Level 2 — Tutorial. Read the steps underneath the headline. I’ve spelled out every step so that you can save your brain power for something else.
- Level 3 — Productivity Nerd. Below the tutorial steps, I’ve included discussion of the behavior design implications. This is for true productivity nerds, i.e. the readers of Better Humans.
Optimize First for Single Tasking
#1. Turn OFF (almost) all notifications
#2. Hide social media slot machines
#3. Hide messaging slot machines
#4. Disable app review requests
#5. Turn on Do Not Disturb
#6. Be strategic about your wallpaper
#7. Turn off Raise to Wake
#8. Add the Screen Time widget
#9. Add Content Restrictions
#10. (Optional) Use Restrictions to turn off Safari
#11. Organize your Apps and Folders alphabetically
Switch to Google Cloud to Work Faster
#12. Choose GMail
#13. Choose Google Calendar
#14. Replace Apple Maps with Google Maps
#15. Install the GBoard keyboard for faster typing
#16. Switch to Google Photos
Install These Apps for Productivity
#17. Use Evernote for all note taking, to-do lists, everything
#18. The Case for Calm as your go-to meditation app
#19. Install the right goal tracker for you
#20. Store all your passwords in a password manager, probably LastPass
#21. Use Numerical as your default calculator
#22. Put the Camera app in your toolbar
#23. Use this Doppler Radar app
#24. Use this Pomodoro app
#25. Use Brain.fm for background noise
Use These Apps and Configurations for Deep Learning
#26. Subscribe to these podcasts
#27. Install the Kindle app but never read it in bed
#28. Use Safari this way
#29. Organize your home screen for deep learning over shallow learning
Use These Apps and Configurations for Longevity
#30. Track steps this way
#31. Prefer Time Restricted Eating Over Calorie Counting
#32. Schedule Night Shift
#33. Set up Medical ID
Make The Finishing Touches with These Configurations
#34. Change Siri to a man
#35. Change your phone’s name
#36. Turn off advertising tracking
#37. Set auto-lock to the maximum time
#38. Set your personal hotspot password to a random three word phrase
#39. Turn on control center everywhere
#40. Turn on Background App Refresh
#41. Delete Garage Band
#42. Develop verbal memory for talking to Siri
#43. Set up these text replacement shortcuts
#44. Set your address
#45. Backup this way
https://klemet.github.io/Workshop-Organization-EN/index.html
Workshop objectives
- Discover and learn to use the basic functions of the following software:
  - Joplin
  - Zotero
  - Nextcloud
- Learn how to use these different software programs together to manage information using the following methods:
  - The Zettelkasten method
  - The P.A.R.A method
  - The inbox method
https://www.fast.ai/2022/07/21/dl-coders-22/
A free course designed for people with some coding experience, who want to learn how to apply deep learning and machine learning to practical problems.
Lessons
- Getting started
- Deployment
- Neural net foundations
- Natural Language (NLP)
- From-scratch Model
- Random forests
- Collaborative filtering and embeddings
- Convolutions (CNNs)