
Steve Jobs Is Resurrected, Meta Is Translating Unwritten Languages and AI Is Running for Office

Researchers continue to push the limits of diffusion models, Steve Jobs returns for a conversation with Joe Rogan and Jack Dorsey’s new decentralized social network is open for beta users. Let’s dive in!

Featured Story

Meta released the first speech-to-speech translation system for languages that are primarily spoken rather than written, starting with Hokkien, a largely oral language spoken within the Chinese diaspora. Although approximately 3,500 languages are spoken without a common writing system, AI-powered voice translation has concentrated almost entirely on written languages. Meta seeks to bridge this gap and “break down language barriers in both the physical world and the metaverse to encourage connection and mutual understanding”.

“For interactions, it will enable people from around the world to communicate with each other more fluidly, making the social graph more interconnected. In addition, using artificial speech translation for content allows you to easily localize content for consumption in multiple languages.” – William Falcon, CEO of Lightning AI

Research Highlights

🎨 Researchers from Google, Technion and the Weizmann Institute of Science released Imagic, a text-based real-image editor for diffusion models. Imagic claims to be the first method to apply complex (e.g., non-rigid) text-guided semantic edits to a single real image. Most text-conditioned image editing methods are restricted to specific editing types, apply only to synthetically generated images, or require multiple input images of a common object. Imagic overcomes these limitations by producing a text embedding that aligns with both the input image and the target text, while fine-tuning the diffusion model to capture the image-specific appearance. As a result, it can transform an image of a sitting dog into one of a jumping dog without any additional inputs.
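To make the embed-then-fine-tune-then-interpolate recipe concrete, here is a minimal PyTorch-style sketch. The model methods (encode_text, denoising_loss, generate) and every hyperparameter are illustrative assumptions, not the authors’ released code.

```python
# Illustrative sketch of the Imagic procedure described above; `model` and its
# methods (encode_text, denoising_loss, generate) are hypothetical placeholders.
import torch

def imagic_edit(model, image, target_prompt, alphas=(0.0, 0.6, 1.0)):
    # Stage A: optimize a text embedding so the *frozen* model reconstructs `image`.
    e_tgt = model.encode_text(target_prompt)            # embedding of the edit text
    e_opt = e_tgt.clone().detach().requires_grad_(True)
    emb_optim = torch.optim.Adam([e_opt], lr=1e-3)
    for _ in range(100):
        loss = model.denoising_loss(image, e_opt)       # standard diffusion objective
        emb_optim.zero_grad(); loss.backward(); emb_optim.step()

    # Stage B: fine-tune the diffusion model on the same objective so it captures
    # the image-specific appearance at the optimized embedding.
    model_optim = torch.optim.Adam(model.parameters(), lr=1e-5)
    for _ in range(1500):
        loss = model.denoising_loss(image, e_opt.detach())
        model_optim.zero_grad(); loss.backward(); model_optim.step()

    # Stage C: interpolate between the optimized and target embeddings and generate;
    # a larger alpha applies the requested edit more strongly.
    return [model.generate(alpha * e_tgt + (1 - alpha) * e_opt.detach())
            for alpha in alphas]
```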

🪙 Researchers from MIT developed a new reparameterization-based methodology that enables automatic differentiation (AD) of programs with discrete randomness. AD, a technique for generating programs that compute the derivative of an existing program, has become ubiquitous in deep learning and scientific computing because of the efficiency of gradient-based optimization. Current AD systems, however, struggle with programs whose discrete stochastic behavior is controlled by distribution parameters, such as flipping a coin whose probability of landing heads is itself a parameter. The MIT researchers propose an approach that they claim yields a low-variance, unbiased gradient estimator while remaining just as automated as conventional AD techniques.
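A toy illustration of why this is hard (this is not the paper’s estimator): the expectation of a coin flip varies smoothly with the bias p, but each individual sample is piecewise constant in p, so naive pathwise AD reports a gradient of zero. A standard score-function (REINFORCE) estimator is unbiased but typically noisy, which is the gap a low-variance method aims to close.

```python
# Toy example: differentiate E[X] for X ~ Bernoulli(p). True derivative is 1.
# This only illustrates the problem; it does NOT implement the paper's method.
import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 200_000
samples = (rng.random(n) < p).astype(float)    # discrete randomness: 1 w.p. p

# Naive pathwise/AD view: each sample is constant in p almost everywhere,
# so per-sample derivatives are 0 and the averaged "gradient" is 0.
naive_grad = 0.0

# Score-function (REINFORCE) estimator: unbiased, but variance grows as p -> 0.
# d/dp log P(x; p) = x/p - (1 - x)/(1 - p)
score = samples / p - (1.0 - samples) / (1.0 - p)
reinforce_grad = float(np.mean(samples * score))

print(naive_grad, reinforce_grad)   # 0.0 vs roughly 1.0
```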

🐯 Researchers from Google, Carnegie Mellon and UC Irvine introduced RARR, a model that automatically researches and revises the output of any language model to fix “hallucinations” while providing citations for each sentence. The development of RARR, which stands for Retrofit Attribution using Research and Revision, was motivated by the difficulty current language models have in producing trustworthy outputs, since they lack built-in mechanisms for attributing claims to external evidence. When applied to the output of several state-of-the-art language models across a variety of generation tasks, RARR is claimed to improve attribution while preserving the original output to a far greater degree than previously studied edit models.
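In pseudocode, the research-and-revise loop described above might look roughly like the sketch below; every function name here is a hypothetical placeholder rather than one of RARR’s actual components.

```python
# Hypothetical sketch of a research-and-revise loop; not RARR's released code.
def research_and_revise(passage, generate_queries, retrieve, agrees, edit):
    revised, report = passage, []
    # Research stage: ask verification questions about the passage and
    # retrieve supporting evidence for each one.
    for question in generate_queries(passage):
        evidence = retrieve(question)
        # Revision stage: only edit where the text disagrees with the evidence,
        # so as much of the original output as possible is preserved.
        if not agrees(revised, question, evidence):
            revised = edit(revised, question, evidence)
        report.append((question, evidence))
    # The per-question evidence doubles as sentence-level citations (attribution).
    return revised, report
```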

ML Engineering Highlights

🎙️ Play.ht, a Dubai-based voice-generation company, created an entirely AI-generated podcast conversation between Steve Jobs and Joe Rogan. Generated with fine-tuned language models, the fictional interview covers topics such as Apple’s success, Jobs’ religious beliefs and his experience with LSD. The 20-minute conversation is the first episode of a podcast series in which listeners can vote on which character they want to hear from next.

🗳️ The Synthetic Party, a new political party in Denmark, has its eyes on the elections in November and bases all of its policies on its AI persona, “Leader Lars”. The bot was developed by the artist collective Computer Lars and the nonprofit art and technology organization MindFuture Foundation. Lars was trained on the policies of Danish fringe parties from the last 40 years and continues to collect input through chats with people on Discord. According to party inventor Asker Staunaes, the AI becomes more refined the more people use it, and the humans on the ballot are committed to acting as a medium for it.

🔥 Pano AI, a California-based company that uses deep learning and computer vision to detect wildfire events in real time, is expanding its coverage in Montana. The expansion comes after an effective deployment during last year’s fire season, when Pano AI’s software helped a Montana fire department quickly locate and contain a wildfire at just 74 acres. Using AI along with satellites and ultra-HD panoramic cameras, Pano AI will continue to detect, assess and contain new wildfires before they grow large enough to endanger lives and property.

Open Source Highlights

🗃️ Google Vizier, the company’s internal service for black-box optimization, is now open source. Vizier has seen thousands of monthly users across Google’s research and production teams since its debut in 2017, and Google says it has completed millions of black-box optimization tasks, saving significant compute and engineering effort. The open-source release is a standalone Python implementation of the Google Vizier API: users optimize their objective function through a user API, while developers can plug in new optimization algorithms through a developer API.
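Here is a minimal user-API sketch, adapted from the usage pattern shown in the google/vizier README; exact class names, defaults and the algorithm string may differ between releases, so treat it as an approximation rather than authoritative API documentation.

```python
# Sketch of optimizing a toy 1-D objective with the OSS Vizier user API.
from vizier.service import clients
from vizier.service import pyvizier as vz

def evaluate(x: float) -> float:
    return -(x - 0.3) ** 2                       # toy objective to maximize

study_config = vz.StudyConfig(algorithm='GAUSSIAN_PROCESS_BANDIT')
study_config.search_space.root.add_float_param('x', 0.0, 1.0)
study_config.metric_information.append(
    vz.MetricInformation('objective', goal=vz.ObjectiveMetricGoal.MAXIMIZE))

# The client spins up a local Vizier service implicitly.
study = clients.Study.from_study_config(study_config, owner='demo', study_id='toy')

for _ in range(20):
    for suggestion in study.suggest(count=1):
        x = suggestion.parameters['x']
        suggestion.complete(vz.Measurement({'objective': evaluate(x)}))
```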

🎈 Bluesky, a decentralized social network initiative created by former Twitter CEO Jack Dorsey, announced its beta waitlist this week. The platform will provide what the development team calls a “federated social network,” enabling users to manage their own data and communication. As an open-source project, Bluesky aims to be fully interoperable with any system built on top of it.

🍝 The world’s largest pasta producer, Barilla, released the design for an open source device that can cut your meal’s CO₂ emissions by up to 80% by cooking your pasta passively. The device consists of an Arduino Nano 33 BLE, an NTC probe, and a few passive components housed in a 3D-printed enclosure. It senses when the water reaches a boil and later notifies your phone once the pasta has finished cooking passively, so you know when to take it out.

Tutorial of the Week

Databases are a fundamental building block of software services. Users rely on them to guarantee the security and integrity of their data, from user records to assets and transaction histories. Learn how to use Redis in a machine learning application with our tutorial here.
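As a taste of what the tutorial covers, here is a minimal sketch of one common pattern: caching model predictions in Redis so repeated requests skip inference. The `model.predict` call and the key scheme are illustrative assumptions; see the tutorial for a complete application.

```python
# Minimal prediction cache backed by Redis (redis-py); `model` is a placeholder
# for any object exposing a `predict(features)` method.
import json
import redis

cache = redis.Redis(host="localhost", port=6379, db=0)

def predict_with_cache(model, features: dict, ttl: int = 3600):
    key = "prediction:" + json.dumps(features, sort_keys=True)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                  # cache hit: skip inference
    result = model.predict(features)               # cache miss: run the model
    cache.set(key, json.dumps(result), ex=ttl)     # expire after `ttl` seconds
    return result
```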


Community Spotlight

Want your work featured? Contact us on Slack or email us at [email protected]

Pooya Mohammadi’s project, a PyTorch implementation of the CRNN model, shows you how to train a license plate recognition model with Lightning. The repo also includes a sample Persian dataset made available by the Amirkabir University of Technology.

⚡ This repo by Moeez Malik contains code for detecting tables in PDF documents using PyTorch and Lightning. It documents how to train the networks, the dataset used, checkpoints, logging, and evaluation.

With Perceval Wajsburt’s project, you can now display your Lightning training logs using Rich, a Python library for rich terminal output. It works both in Jupyter notebooks and on the command line, and integrates easily with Lightning.
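For a related, minimal example of wiring Rich-rendered output into a Trainer (this uses Lightning’s built-in callback, not Perceval’s project), something like the following works, assuming pytorch_lightning and rich are installed and `model` is your LightningModule:

```python
# Minimal sketch: Rich-rendered progress output via Lightning's built-in callback.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import RichProgressBar

trainer = pl.Trainer(max_epochs=3, callbacks=[RichProgressBar()])
# trainer.fit(model, datamodule)   # plug in your LightningModule / DataModule
```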


Lightning AI Highlights

🎓 Attending NeurIPS this year? Interested in transitioning from academia to industry? Stop by our social, hosted by two of Lightning’s very own team members who have made this transition themselves. Learn more here.

⚡ Join Thomas Chaton, one of our tech leads, as he hosts a workshop on building tailored machine learning apps with Lightning. He’ll focus on reusable templates for training, serving, UI, monitoring, autoscaling, and alerting for your ML solution. (October 25 at 8am PT)

🎨 If you’ve used social media in the past six months, you’ve likely come across impressive text-to-image generation models like DALL-E 2 or Stable Diffusion, which can create stunning images in seconds from just a few words. Join Aniket Maurya, one of our developer advocates, as he hosts a livestream exploring what these diffusion models are and how you can deploy your own text-to-image generator with them, like we did with Muse.


Don’t Miss the Submission Deadline

  • AAMAS 2023: The 22nd International Conference on Autonomous Agents and Multiagent Systems. May 29 – June 2, 2023. (London, UK) Paper Submission Deadline: October 29, 2022, 04:59 PDT
  • CVPR 2023: The IEEE/CVF Conference on Computer Vision and Pattern Recognition. June 18–22, 2023. (Vancouver, Canada) Paper Submission Deadline: November 11, 2022, 23:59 PST

Upcoming Conferences

  • IROS 2022: International Conference on Intelligent Robots and Systems. Oct 23-27, 2022 (Kyoto, Japan)
  • NeurIPS 2022: Thirty-sixth Conference on Neural Information Processing Systems. Nov 28 – Dec 9, 2022. (New Orleans, Louisiana)
  • PyTorch Conference: Brings together leading academics, researchers and developers from the Machine Learning community to learn more about software releases on PyTorch. Dec 2, 2022 (New Orleans, Louisiana)