


Similar articles
Article Id | Source | Date | Title | Prob | Score
182321 | VENTUREBEAT | 2020-05-19 | OpenAI’s supercomputer collaboration with Microsoft marks its biggest bet yet on AGI | 1.000 | -
182362 | THENEXTWEB | 2020-05-19 | Microsoft just built ‘one of the top five most powerful’ supercomputers on the planet | 0.902 | 0.633
182304 | VENTUREBEAT | 2020-05-19 | Microsoft’s ZeRO-2 with DeepSpeed trains neural networks with up to 170 billion parameters | 0.664 | 0.618
182176 | VENTUREBEAT | 2020-05-20 | Search engines are leveraging AI to improve their language understanding | 0.017 | 0.514
182306 | VENTUREBEAT | 2020-05-19 | Microsoft debuts WhiteNoise, an AI toolkit for differential privacy | 0.014 | 0.488
182318 | VENTUREBEAT | 2020-05-19 | Microsoft launches Project Bonsai, an AI development platform for industrial systems | 0.045 | 0.478
181620 | VENTUREBEAT | 2020-05-15 | Facebook’s voice synthesis AI generates speech in 500 milliseconds | - | 0.455
181563 | VENTUREBEAT | 2020-05-17 | Microsoft chief scientist: Humans and AI work better together than alone | - | 0.449
181580 | VENTUREBEAT | 2020-05-18 | IBM adds noise to boost AI’s accuracy on analog memory | 0.015 | 0.430
181610 | VENTUREBEAT | 2020-05-15 | Researchers release data sets to train coronavirus chatbots | - | 0.406
181461 | THENEXTWEB | 2020-05-18 | Everything you need to know about artificial general intelligence | 0.013 | 0.405
181532 | VENTUREBEAT | 2020-05-16 | Hugging Face dives into machine translation with release of 1,000 models | - | 0.401
182324 | VENTUREBEAT | 2020-05-19 | Microsoft acquires RPA startup Softomotive to bolster Power Automate | - | 0.399
182269 | VENTUREBEAT | 2020-05-19 | The AI Show: How Intel built a chip with a sense of smell | - | 0.390
182309 | VENTUREBEAT | 2020-05-19 | Microsoft is building Industry Clouds for health care and other fields | - | 0.389
182330 | VENTUREBEAT | 2020-05-19 | Nvidia researchers propose technique to transfer AI trained in simulation to the real world | - | 0.373
181622 | VENTUREBEAT | 2020-05-18 | DeepMind researchers develop method to efficiently teach robots tasks like grasping | - | 0.354
182301 | VENTUREBEAT | 2020-05-19 | Azure Synapse Link anchors Microsoft’s new big data analytics capabilities | - | 0.341
182240 | TECHREPUBLIC | 2020-05-21 | Microsoft Azure: Boosting machine learning and virtual desktops with GPUs | - | 0.340
182359 | VENTUREBEAT | 2020-05-21 | Nvidia reports $3.08 billion in Q1 2020 revenue, up 39% as AI and cloud soar | - | 0.331
182328 | VENTUREBEAT | 2020-05-19 | Facebook details the AI behind its shopping experiences | - | 0.328
182157 | TECHREPUBLIC | 2020-05-20 | Build 2020: Azure Stack Hub updated with six new features, including machine-learning tools | - | 0.316
181628 | TECHREPUBLIC | 2020-05-15 | Geoprocessing-enabled COVID-19 map aids resource allocation amid pandemic | - | 0.310
182363 | TECHREPUBLIC | 2020-05-19 | Microsoft Build 2020: Microsoft's new Cloud for Healthcare lets doctors schedule telemedicine visits in Teams | - | 0.310
182261 | VENTUREBEAT | 2020-05-19 | How to watch Microsoft’s Build 2020 stream with 100,000 attendees | - | 0.308


ID: 182321

URL: https://venturebeat.com/2020/05/19/openai-microsoft-azure-supercomputer-ai-model-training/

Date: 2020-05-19

OpenAI’s supercomputer collaboration with Microsoft marks its biggest bet yet on AGI

Roughly a year ago, Microsoft announced it would invest $1 billion in OpenAI to jointly develop new technologies for Microsoft's Azure cloud platform and to further extend large-scale AI capabilities that deliver on the promise of artificial general intelligence (AGI). In exchange, OpenAI agreed to license some of its intellectual property to Microsoft, which the company would then commercialize and sell to partners, and to train and run AI models on Azure as OpenAI worked to develop next-generation computing hardware.

Today, during Microsoft's Build 2020 developer conference, the first fruit of the partnership was revealed: a new supercomputer that Microsoft says was built in collaboration with — and exclusively for — OpenAI on Azure. Microsoft claims it's the fifth most powerful machine in the world as measured against the TOP500, a project that benchmarks and ranks the 500 top-performing supercomputers. According to the most recent rankings, that would slot it behind the China National Supercomputer Center's Tianhe-2A and ahead of the Texas Advanced Computing Center's Frontera, meaning it can perform somewhere between 38.7 and 100.7 quadrillion floating-point operations per second (i.e., petaflops) at peak.

OpenAI has long asserted that immense computational horsepower is a necessary step on the road to AGI, or AI that can learn any task a human can. While luminaries like Mila founder Yoshua Bengio and Facebook VP and chief AI scientist Yann LeCun argue that AGI can't exist, OpenAI's cofounders and backers — among them Greg Brockman, chief scientist Ilya Sutskever, Elon Musk, Reid Hoffman, and former Y Combinator president Sam Altman — believe powerful computers in conjunction with reinforcement learning and other techniques can achieve paradigm-shifting AI advances. The unveiling of the supercomputer represents OpenAI's biggest bet yet on that vision.

The new Azure-hosted, OpenAI-co-designed machine contains more than 285,000 processor cores and 10,000 graphics cards, with 400 gigabits per second of connectivity for each graphics card server. It was designed to train single massive AI models, which are models that learn from ingesting billions of pages of text from self-published books, instruction manuals, history lessons, human resources guidelines, and other publicly available sources. Examples include a natural language processing (NLP) model from Nvidia that contains 8.3 billion parameters, or configurable variables internal to the model whose values are used in making predictions; Microsoft's Turing NLG (17 billion parameters), which achieves state-of-the-art results on a number of language benchmarks; Facebook's recently open-sourced Blender chatbot framework (9.4 billion parameters); and OpenAI's own GPT-2 model (1.5 billion parameters), which generates impressively humanlike text given short prompts.

"As we've learned more and more about what we need and the different limits of all the components that make up a supercomputer, we were really able to say, 'If we could design our dream system, what would it look like?'" OpenAI CEO Sam Altman said in a statement. "And then Microsoft was able to build it. We are seeing that larger-scale systems are an important component in training more powerful models."

Studies show that these large models perform well because they can deeply absorb the nuances of language, grammar, knowledge, concepts, and context, enabling them to summarize speeches, moderate content in live gaming chats, parse complex legal documents, and even generate code by scouring GitHub.
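To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch (an illustration added for scale, not taken from the article) of the memory needed just to store each model's weights, assuming a hypothetical 16-bit (2-byte) representation per parameter:

```python
# Rough estimate of the memory needed just to hold model weights,
# assuming (hypothetically) 16-bit floats, i.e. 2 bytes per parameter.
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    return num_params * bytes_per_param / 1e9

models = {
    "OpenAI GPT-2": 1.5e9,
    "Nvidia NLP model": 8.3e9,
    "Facebook Blender": 9.4e9,
    "Microsoft Turing NLG": 17e9,
}

for name, params in models.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
```

Training requires several times more memory than this again, for gradients, optimizer state, and activations, which is part of why runs at this scale are spread across thousands of graphics cards like the machine described above.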
Microsoft has used its Turing models — which will soon be available in open source — to bolster language understanding across Bing, Office, Dynamics, and its other productivity products. In Bing, the models improved caption generation and question answering by up to 125% in some markets, Microsoft claims. In Office, they ostensibly fueled advances in Word's Smart Lookup and Key Insights tools. Outlook uses them for Suggested Replies, which automatically generates possible responses to emails. And in Dynamics 365 Sales Insights, they suggest actions to sellers based on interactions with customers.

From a technical standpoint, the large models are superior to their forebears in that they're self-supervised, meaning they can generate labels from data by exposing relationships between the data's parts — a step believed to be critical to achieving human-level intelligence. That's as opposed to supervised learning algorithms, which train on human-labeled data sets and can be difficult to fine-tune for tasks particular to industries, companies, or topics of interest.

"The exciting thing about these models is the breadth of the things [they've] enable[d]," Microsoft chief technical officer Kevin Scott said in a statement. "This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in computer vision, and when you start to see combinations of these perceptual domains, you're going to have new applications that are hard to even imagine right now."

Models like those within the Turing family are a far cry from AGI, but Microsoft says it's using the supercomputer to explore large models that can learn in a generalized way across text, image, and video data. So, too, is OpenAI. As MIT Technology Review reported earlier this year, a team within OpenAI called Foresight runs experiments to test how far it can push AI capabilities by training algorithms with increasingly large amounts of data and compute. Separately, according to that same bombshell report, OpenAI is developing a system trained on images, text, and other data using massive computational resources that the company's leadership believes is the most promising path toward AGI.

Indeed, Brockman and Altman in particular believe AGI will be able to master more fields than any one person, chiefly by identifying complex cross-disciplinary connections that elude human experts. Furthermore, they predict that responsibly deployed AGI — in other words, AGI deployed in close collaboration with researchers in relevant fields, like social science — might help solve longstanding challenges in climate change, health care, and education.

It's unclear whether the new supercomputer is powerful enough to achieve anything close to AGI, whatever form that might take; last year, Brockman told the Financial Times that OpenAI expects to spend the whole of Microsoft's $1 billion investment by 2025 building a system that can run a human brain-sized AI model. In 2018, OpenAI's own researchers released an analysis showing that from 2012 to 2018, the amount of compute used in the largest AI training runs grew more than 300,000 times, with a 3.5-month doubling time, far exceeding the pace of Moore's law. On pace with this trend, last week IBM detailed the Neural Computer, which uses hundreds of custom-designed chips to train Atari-playing AI in record time, and Nvidia announced the DGX A100, a 5-petaflop server built around its A100 Tensor Core graphics card.
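As a quick sanity check on those growth figures (an illustrative calculation, not part of the OpenAI analysis itself), the roughly 300,000-fold increase and the 3.5-month doubling time fit together as follows:

```python
import math

# How many doublings does a ~300,000x increase in training compute imply,
# and how long does that take at one doubling every 3.5 months?
growth = 300_000
doubling_time_months = 3.5

doublings = math.log2(growth)               # ~18.2 doublings
months = doublings * doubling_time_months   # ~64 months

print(f"{doublings:.1f} doublings over ~{months:.0f} months (~{months / 12:.1f} years)")
```

By comparison, Moore's law's historical two-year doubling would produce only around a 6x increase over the same span, which is the gap the analysis highlights.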
There's evidence that efficiency improvements might offset the mounting compute requirements. A separate, more recent OpenAI survey found that since 2012, the amount of compute needed to train an AI model to the same performance on classifying images in a popular benchmark (ImageNet) has been decreasing by a factor of two every 16 months. But the extent to which compute contributes to performance, compared with novel algorithmic approaches, remains an open question.

It should be noted, of course, that OpenAI has achieved remarkable AI gains in gaming and media synthesis with fewer resources at its disposal. On Google Cloud Platform, the company's OpenAI Five system played 180 years' worth of games every day on 256 Nvidia Tesla P100 graphics cards and 128,000 processor cores to beat professional players (and 99.4% of players in public matches) at Valve's Dota 2. More recently, the company trained a system on at least 64 Nvidia V100 graphics cards and 920 worker machines with 32 processor cores each to manipulate a Rubik's Cube with a robot hand, albeit with a relatively low success rate. And OpenAI's Jukebox model used 896 V100 graphics cards to learn to generate music in any style from scratch, complete with lyrics.

Whether the supercomputer turns out to be a small stepping stone or a large leap toward AGI, the software tools used to design it potentially open new market opportunities for Microsoft. Through its AI at Scale initiative, the tech giant is making resources available to train large models on Azure AI accelerators and networks in an optimized way: it splits training data into batches that are used to train multiple instances of a model across clusters, and those instances are periodically averaged to produce a single model. The resources include a new version of DeepSpeed, an AI library for Facebook's PyTorch machine learning framework that can train models over 15 times larger and 10 times faster on the same infrastructure, as well as support for distributed training on the ONNX Runtime. When used with DeepSpeed, distributed training on ONNX enables models across hardware and operating systems to deliver performance improvements of up to 17 times, Microsoft claims.

"By developing this leading-edge infrastructure for training large AI models, we're making all of Azure better," Microsoft chief technical officer Kevin Scott said in a statement. "We're building better computers, better distributed systems, better networks, better datacenters. All of this makes the performance and cost and flexibility of the entire Azure cloud better."
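For readers curious what the batch-splitting-and-averaging scheme described above can look like in code, here is a deliberately simplified PyTorch sketch of generic data parallelism (an illustration under assumed semantics, not DeepSpeed's or ONNX Runtime's actual implementation; the function name average_replicas is hypothetical):

```python
import copy
import torch
import torch.nn as nn

def average_replicas(replicas):
    """Merge model replicas, each trained on its own shard of the data,
    into a single model by averaging their parameters element-wise."""
    merged = copy.deepcopy(replicas[0])
    with torch.no_grad():
        for name, param in merged.named_parameters():
            stacked = torch.stack(
                [dict(r.named_parameters())[name] for r in replicas]
            )
            param.copy_(stacked.mean(dim=0))
    return merged

# Toy usage: three replicas of the same small network, each of which would
# be trained on a different slice of the data before being averaged.
replicas = [nn.Linear(8, 2) for _ in range(3)]
single_model = average_replicas(replicas)
print(single_model.weight.shape)  # torch.Size([2, 8])
```

Production frameworks typically synchronize gradients every step and shard optimizer state rather than occasionally averaging whole models, but the underlying idea of training parallel copies on different slices of data and reconciling them into one model is the same.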