Machine Learning Software Engineering Daily (Public)
 
Mark Saroufim is the author of an article entitled “Machine Learning: The Great Stagnation”. Mark is a PyTorch Partner Engineer with Facebook AI. He has spent his entire career developing machine learning and artificial intelligence products. Before joining Facebook to do PyTorch engineering with external partners, Mark was a Machine Learning Engin…
 
Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering and the Halicioglu Data Science Institute at the University of California, San Diego. His primary research interests are in data management and systems for machine learning/artificial intelligence-based data analytics. Systems and ideas based on his research …
 
Application Programming Interfaces (APIs) are interfaces that enable multiple software applications to send and retrieve data from one another. They are commonly used for retrieving, saving, editing, or deleting data from databases, transmitting data between apps, and embedding third-party services into apps. The company BaseTen helps companies bui…
 
Natural Language Processing (NLP) is a branch of artificial intelligence concerned with giving computers the ability to understand text and spoken words. “Understanding” includes intent, sentiment, and what’s important in the message. NLP powers things like voice-operated software, digital assistants, customer service chatbots, and many other acad…
 
Using artificial intelligence and machine learning in a product or database is traditionally difficult because it involves a lot of manual setup, specialized training, and a clear understanding of the various ML models and algorithms. You need to develop the right ML model for your data, train the model, evaluate it, optimize it, analyze it for out…
 
Creation Labs is helping bring Europe one step closer to fully autonomous long-haul trucking. They have developed an AI Driver Assistance System (AIDAS) that retrofits to any commercial vehicle, starting with VW Crafters and MAN TGE trucks. Their system uses camera hardware mounted to the vehicle to capture video data that is processed with computer …
 
Vectors are the foundational mathematical building blocks of Machine Learning. Machine Learning models must transform input data into vectors to perform their operations, creating what is known as a vector embedding. Since data is not stored in vector form, an ML application must perform significant work to transform data in different formats into …
 
The incredible advances in machine learning research in recent years often take time to propagate out into usage in the field. One reason for this is that such “state-of-the-art” results for machine learning performance rely on the use of handwritten, idiosyncratic optimizations for specific hardware models or operating contexts. When developers ar…
 
Embedded Software Engineering is the practice of building software that controls embedded systems: machines or devices other than standard computers. Embedded systems appear in a variety of applications, from small microcontrollers, to consumer electronics, to large-scale machines such as cars, airplanes, and machine tools. iRobot is a con…
 
Reinforcement learning is a paradigm in machine learning that uses incentives, or “reinforcement,” to drive learning. The learner is conceptualized as an intelligent agent working within a system of rewards and penalties in order to solve a novel problem. The agent is designed to maximize rewards while pursuing a solution by trial-and-error. Progra…
 
Companies can have a negative impact on the environment by outputting excess carbon. Many companies want to reduce their net carbon impact to zero, which can be done by investing in forests. Pachama is a marketplace for forest investments. Pachama uses satellites, imaging, machine learning, and other techniques to determine how much carbon is being…
 
TensorFlow Lite is an open source deep learning framework for on-device inference. TensorFlow Lite was designed to improve the viability of machine learning applications on phones, sensors, and other IoT devices. Pete Warden works on TensorFlow Lite at Google and joins the show to talk about the world of machine learning applications and the necess…
 
Originally published July 30, 2019. “Internet of Things” is a term used to describe the increasing connectivity and intelligence of physical objects within our lives. IoT has manifested within enterprises under the term “Industrial IoT,” as wireless connectivity and machine learning have started to improve devices such as centrifuges, conveyor belts…
 
Originally published April 17, 2019. Drishti is a company focused on improving manufacturing workflows using computer vision. A manufacturing environment consists of assembly lines. A line is composed of sequential stations along that manufacturing line. At each station on the assembly line, a worker performs an operation on the item that is being m…
 
Originally published June 21, 2019. Niantic is the company behind Pokemon Go, an augmented reality game where users walk around in the real world and catch Pokemon which appear on their screen. The idea for augmented reality has existed for a long time. But the technology to bring augmented reality to the mass market has appeared only recently. Impr…
 
Originally published December 9, 2019. Machine learning algorithms have existed for decades. But in the last ten years, several advancements in software and hardware have caused dramatic growth in the viability of applications based on machine learning. Smartphones generate large quantities of data about how humans move through the world. Software-a…
 
Originally published January 25, 2019. When TensorFlow came out of Google, the machine learning community converged around it. TensorFlow is a framework for building machine learning models, but the lifecycle of a machine learning model has a scope that is bigger than just creating a model. Machine learning developers also need to have a testing and…
 
Originally published April 3, 2017. A hedge fund is a collection of investors that make bets on the future. The “hedge” refers to the fact that the investors often try to diversify their strategies so that the directions of their bets are less correlated, and they can be successful in a variety of future scenarios. Engineering-focused hedge funds hav…
 
For several years, we have had the ability to create artificially generated text articles. More recently, audio and video synthesis have been feasible for artificial intelligence. Rosebud is a company that creates animated virtual characters that can speak. Users can generate real or fictional presenters easily with Rosebud. Dzmitry Pletnikau is an…
 
Originally published November 7, 2018. An instruction set defines a low-level programming language for moving information throughout a computer. In the early 1970s, the prevalent instruction set language used a large vocabulary of different instructions. One justification for a large instruction set was that it would give a programmer more freedom …
 
October 1, 2019 The development of self-driving cars is one of the biggest technological changes that is under way. Across the world, thousands of engineers are working on developing self-driving cars. Although it still seems far away, self-driving cars are starting to feel like an inevitability. This is especially true if you spend much time in do…
 
Newer machine learning tooling is often focused on streamlining the workflows and developer experience. One such tool is BentoML. BentoML is a workflow that allows data scientists and developers to ship models more effectively. Chaoyu Yang is the creator of BentoML and he joins the show to talk about why he created Bento and the engineering behind …
 
Data labeling is a major bottleneck in training and deploying machine learning and especially NLP. But new tools for training models with humans in the loop can drastically reduce how much data is required. Humanloop is a platform for annotating text and training NLP models with much less labelled data. Raza Habib, founder of Humanloop, joins the s…
 
Federated learning is machine learning without a centralized data source. Federated Learning enables mobile phones or edge servers to collaboratively learn a shared prediction model while keeping all the training data on device. Mike Lee Williams is an expert in federated learning, and he joins the show to give an overview of the subject and share …
 
Machine learning models require training data, and training data needs to be labeled. Raw images and text can be labeled using a training data platform like Labelbox. Labelbox is a system of labeling tools that enables a human workforce to create data that is ready to be consumed by machine learning training algorithms. The Labelbox team joins the …
 
Training a computer vision model is not easy. Bottlenecks in the development process make it even harder. Ad hoc code, inconsistent data sets, and other workflow issues hamper the ability to streamline models. Roboflow is a company built to simplify and streamline these model training workflows. Brad Dwyer is a founder of Roboflow and joins the sho…
 
Machine learning models are only as good as the datasets they’re trained on. Aquarium is a system that helps machine learning teams make better models by improving their dataset quality. Model improvement is often made by curating high quality datasets, and Aquarium helps make that a reality. Peter Gao works on Aquarium, and he joins the show to ta…
 
Factories require quality assurance work. That QA work can be accomplished by a robot with a camera together with computer vision. This allows for sophisticated inspection techniques that do not require as much manual effort on the part of a human. Arye Barnehama is a founder of Elementary Robotics, a company that makes these kinds of robots. Arye …
 
Robotic process automation involves the scripting and automation of highly repeatable tasks. RPA tools such as UiPath paved the way for a newer wave of automation, including the Robot Framework, an open source system for RPA. Antti Karjalainen is the CEO of Robocorp, a company that provides an RPA tool suite for developers. Antti joins the show to …
 
Hyperparameters define the strategy for exploring a space in which a machine learning model is being developed. Whereas the parameters of a machine learning model are the values learned from the data coming into a system, the hyperparameters define how those data points are fed into the training process for building a model to be used by an end consumer. A differen…
 
CrowdFlower was a company started in 2007 by Lukas Biewald, an entrepreneur and computer scientist. CrowdFlower solved some of the data labeling problems that were not being solved by Amazon Mechanical Turk. A decade after starting CrowdFlower, the company was sold for several hundred million dollars. Today, data labeling has only grown in volume a…
 
Chatbots are useful for developing well-defined applications such as first-contact customer support, sales, and troubleshooting. But the potential for chatbots is so much greater. Over the last five years, there have been numerous platforms that have arisen to allow for better, more streamlined chatbot creation. Dialogue software enables the creati…
 
Image annotation is necessary for building supervised learning models for computer vision. An image annotation platform streamlines the annotation of these images. Well-known annotation platforms include Scale AI, Amazon Mechanical Turk, and CrowdFlower. There are also large consulting-like companies that will annotate images in bulk for you. If yo…
 
Drug trials can lead to new therapeutics and preventative medications being discovered and placed on the market. Unfortunately, these drug trials typically require animal testing. This means animals are killed or harmed as a result of needing to verify that a drug will not kill humans. Animal testing is unavoidable, but the extent to which testing …
 
Netflix runs all of its infrastructure on Amazon Web Services. This includes business logic, data infrastructure, and machine learning. By tightly coupling itself to AWS, Netflix has been able to move faster and have strong defaults about engineering decisions. And today, AWS has such an expanse of services that it can be used as a platform to buil…
 
Developing machine learning models is not easy. From the perspective of the machine learning researcher, there is the iterative process of tuning hyperparameters and selecting relevant features. From the perspective of the operations engineer, there is a handoff from development to production, and the management of GPU clusters to parallelize model…
 
Deepgram is an end-to-end deep learning platform for speech recognition. Unlike the general purpose APIs from Google or Amazon, Deepgram models are custom-trained for each customer. Whether the customer is a call center, a podcasting company, or a sales department, Deepgram can work with them to build something specific to their use case. Sound dat…
 
At a customer service center, thousands of hours of audio are generated. This audio provides a wealth of information to transcribe and analyze. With the additional data of the most successful customer service representatives, machine learning models can be trained to identify which speech patterns are associated with a successful worker. By identif…
 
Originally published October 8, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Video surveillance impacts human lives every day. On most days, we do not feel the impact of video surveillance. But the effects of video surveillance have tremendous potential. It can be used to solve crimes and find missing children. It can …
 
Originally published June 13, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Machine learning allows software to improve as that software consumes more data. Machine learning is a tool that every software engineer wants to be able to use. Because machine learning is so broadly applicable, software companies want to make …
 
Originally published January 31, 2019. We are taking a few weeks off. We’ll be back soon with new episodes. Artificial intelligence is reshaping every aspect of our lives, from transportation to agriculture to dating. Someday, we may even create a superintelligence: a computer system that is demonstrably smarter than humans. But there is widespread …
 
Cruise is an autonomous car company with a development cycle that is highly dependent on testing its cars, both in the wild and in simulation. The testing cycle typically requires cars to drive around gathering data, and that data to subsequently be integrated into a simulated system called Matrix. With COVID-19, the ability to run tests in the wild…
 
Machine learning workflows have had a problem for a long time: taking a model from the prototyping step and putting it into production is not an easy task. A data scientist who is developing a model is often working with different tools, or a smaller data set, or different hardware than the environment which that model will be deployed to. This pro…
 
Devices on the edge are becoming more useful with improvements in the machine learning ecosystem. TensorFlow Lite allows machine learning models to run on microcontrollers and other devices with only kilobytes of memory. Microcontrollers are very low-cost, tiny computational devices. They are cheap, and they are everywhere. The low-energy embedded …
 
Chatbots became widely popular around 2016 with the growth of chat platforms like Slack and voice interfaces such as Amazon Alexa. As chatbots came into use, so did the infrastructure that enabled chatbots. NLP APIs and complete chatbot frameworks came out to make it easier for people to build chatbots. The first suite of chatbot frameworks were la…
 
Machine learning models require the use of training data, and that data needs to be labeled. Today, we have high quality data infrastructure tools such as TensorFlow, but we don’t have large high quality data sets. For many applications, the state of the art is to manually label training examples and feed them into the training process. Snorkel is …
 
Descript is a software product for editing podcasts and video. Descript is a deceptively powerful tool, and its software architecture includes novel usage of transcription APIs, text-to-speech, speech-to-text, and other domain-specific machine learning applications. Some of the most popular podcasts and YouTube channels use Descript as their editin…
 
Machine learning applications are widely deployed across the software industry. Most of these applications use supervised learning, a process in which labeled data sets are used to find correlations between the labels and the trends in that underlying data. But supervised learning is only one application of machine learning. Another broad set of m…
 
Machine learning algorithms have existed for decades. But in the last ten years, several advancements in software and hardware have caused dramatic growth in the viability of applications based on machine learning. Smartphones generate large quantities of data about how humans move through the world. Software-as-a-service companies generate data ab…
 
Originally published June 7, 2018. Moore’s Law states that the number of transistors in a dense integrated circuit doubles about every two years. Moore’s Law is less like a “law” and more like an observation or a prediction. Moore’s Law is ending. We can no longer fit an increasing number of transistors in the same amount of space with a highly pred…
 