AWS focuses on machine learning with new Amazon Personalize and Textract services

Matt Barclay
AWS continues to build out its machine learning portfolio, announcing the availability of Amazon Personalize, a service offering AI-powered personalisation techniques, along with Textract, which automatically extracts text and data from tables and forms.

Another significant development sees the launch of Amazon MSK, to help developers build and run highly available, secure and scalable applications based on Apache Kafka.

Let’s take a closer look at each, starting with Amazon Personalize, a fully managed service that brings Amazon.com’s AI-powered personalisation techniques to AWS customers, even those with no prior machine learning expertise.

The new service trains, tunes and deploys custom, private machine learning models. It also provisions the necessary infrastructure and manages the entire machine learning pipeline, including processing the data, identifying features, selecting algorithms, as well as training, optimising and hosting the results. Customers receive results via an Application Programming Interface (API) and only pay for what they use, with no minimum fees or upfront commitments.
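As a rough illustration of what receiving results via an API can look like, the sketch below asks a deployed campaign for recommendations and keeps only the item IDs. It assumes the `get_recommendations` operation of the Personalize runtime client in the AWS SDK for Python (boto3); the helper name, the campaign ARN, the user ID and the stub client are hypothetical and exist only so the example runs without an AWS account.

```python
from typing import Any, List


def recommended_item_ids(client: Any, campaign_arn: str, user_id: str, limit: int = 5) -> List[str]:
    """Return the recommended item IDs for one user from a deployed campaign."""
    response = client.get_recommendations(
        campaignArn=campaign_arn,
        userId=user_id,
        numResults=limit,
    )
    # The response is expected to carry an "itemList" of {"itemId": ...} entries.
    return [item["itemId"] for item in response.get("itemList", [])]


class StubRuntimeClient:
    """Hypothetical stand-in for boto3.client("personalize-runtime")."""

    def get_recommendations(self, campaignArn: str, userId: str, numResults: int) -> dict:
        # Made-up catalogue; a real campaign would rank items per user.
        catalogue = ["guitar-01", "amp-07", "strings-03", "tuner-02"]
        return {"itemList": [{"itemId": item_id} for item_id in catalogue[:numResults]]}


# Placeholder ARN and user ID, for illustration only.
items = recommended_item_ids(StubRuntimeClient(), "arn:aws:personalize:example", "user-42", limit=2)
print(items)
```

With real credentials, the stub would be swapped for the SDK's runtime client and the ARN of an actual campaign; the helper itself would stay the same.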

The service can help in developing applications for a wide array of personalisation use cases, including specific product recommendations and individualised search results. For instance, Yamaha, which sells a range of musical instruments and audio products, is already using it to offer customers personalised product suggestions.

Machine learning is also the main focus of Textract, a newly available service that automatically extracts text and data from tables or forms in virtually any document. Once again, no machine learning experience is required.

Many companies use optical character recognition (OCR) software to extract text and data from files. However, traditional OCR technologies often struggle to recognise common layouts like forms and tables, resulting in lengthy and often inaccurate text dumps. 

Textract’s API accepts several kinds of input, including scans, PDFs and photos. Customers can then load the resultant data into business software, such as spreadsheets, databases and payroll systems, or they can analyse and query the data using Amazon Elasticsearch Service, Amazon DynamoDB, Amazon Redshift or Amazon Athena.

Usage examples include identifying text and data such as line items and totals from a photographed receipt, or values from a table in a scanned inventory report. It is also capable of recognising a range of document formats, including those specific to financial services, insurance and healthcare, without requiring any customisation or human intervention.
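To make the receipt example more concrete, here is a minimal sketch that rebuilds table rows from a Textract-style response. It assumes the block layout returned by the document analysis API, in which `CELL` blocks carry `RowIndex` and `ColumnIndex` and point to their `WORD` blocks through `CHILD` relationships. The sample blocks are invented, and the helper `table_rows` is hypothetical and handles only a single table.

```python
from typing import Dict, List, Tuple


def child_ids(block: dict) -> List[str]:
    """Collect the IDs a block points to through CHILD relationships."""
    return [
        child_id
        for relationship in block.get("Relationships", [])
        if relationship["Type"] == "CHILD"
        for child_id in relationship["Ids"]
    ]


def table_rows(blocks: List[dict]) -> List[List[str]]:
    """Rebuild one table as a list of rows (assumes a single table in the response)."""
    by_id = {block["Id"]: block for block in blocks}
    cells: Dict[Tuple[int, int], str] = {}
    for block in blocks:
        if block["BlockType"] != "CELL":
            continue
        words = [by_id[i]["Text"] for i in child_ids(block) if by_id[i]["BlockType"] == "WORD"]
        cells[(block["RowIndex"], block["ColumnIndex"])] = " ".join(words)
    if not cells:
        return []
    row_count = max(row for row, _ in cells)
    column_count = max(column for _, column in cells)
    # Row and column indices are 1-based; missing cells become empty strings.
    return [
        [cells.get((row, column), "") for column in range(1, column_count + 1)]
        for row in range(1, row_count + 1)
    ]


def cell(block_id: str, row: int, column: int, word_ids: List[str]) -> dict:
    """Build a made-up CELL block for the sample below."""
    return {"Id": block_id, "BlockType": "CELL", "RowIndex": row, "ColumnIndex": column,
            "Relationships": [{"Type": "CHILD", "Ids": word_ids}]}


# Invented blocks resembling a two-row receipt table.
sample_blocks = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    cell("c1", 1, 1, ["w1"]),
    cell("c2", 1, 2, ["w2"]),
    cell("c3", 2, 1, ["w3", "w4"]),
    cell("c4", 2, 2, ["w5"]),
    {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Total"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Flat"},
    {"Id": "w4", "BlockType": "WORD", "Text": "white"},
    {"Id": "w5", "BlockType": "WORD", "Text": "3.50"},
]

rows = table_rows(sample_blocks)
print(rows)
```

Rows in this shape can be written straight to a spreadsheet or database table, which is the kind of downstream loading described above.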

Elsewhere, the new Amazon Managed Streaming for Apache Kafka (Amazon MSK) provides a highly available, secure and compatible data streaming service for Apache Kafka.

The aim is to help businesses looking to use the popular open source distributed streaming platform to avoid the time and expense of setting up, scaling and managing Apache Kafka clusters for capturing and analysing real-time data streams from a range of sources, including IoT devices, website clickstreams, financial systems and database logs.

The new service makes it easy for developers to build and run applications based on Apache Kafka without having to worry about managing the underlying infrastructure. It’s fully compatible with Apache Kafka, meaning customers can easily migrate their on-premises or Amazon Elastic Compute Cloud (Amazon EC2) clusters to Amazon MSK with no code changes.
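One way to picture a migration with no code changes: a Kafka client is typically pointed at a cluster through its `bootstrap.servers` setting, so moving to a managed cluster can come down to a configuration change. The sketch below is an assumption-laden illustration, not a migration guide. The broker addresses and client ID are hypothetical, and a real move may also involve security settings such as TLS.

```python
from typing import Dict, Set


def producer_config(bootstrap_servers: str) -> Dict[str, str]:
    """Build standard Kafka producer settings for a given broker list."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "client.id": "clickstream-producer",  # hypothetical client name
        "acks": "all",  # wait for all in-sync replicas to acknowledge
    }


def changed_keys(before: Dict[str, str], after: Dict[str, str]) -> Set[str]:
    """Return the setting names whose values differ between two configurations."""
    return {key for key in before.keys() | after.keys() if before.get(key) != after.get(key)}


# Hypothetical broker addresses, for illustration only.
on_premises = producer_config("kafka-1.internal.example:9092,kafka-2.internal.example:9092")
managed = producer_config("broker-1.msk.example:9092,broker-2.msk.example:9092")

diff = changed_keys(on_premises, managed)
print(diff)
```

Under these assumptions the application code that produces and consumes messages is untouched; only the broker list differs between the two environments.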

It also works with several other AWS cloud offerings, such as providing metrics in Amazon CloudWatch. Support should be added soon for AWS CloudFormation to assist in describing and provisioning infrastructure resources in a cloud environment.

Rajesh Sheth, General Manager of Amazon MSK, explained the rationale behind the new offering, saying: “Customers who are running Apache Kafka have told us they want to spend less time managing infrastructure and more time building applications based on real-time streaming data.

“Amazon MSK gives these customers the ability to run Apache Kafka without having to worry about managing the underlying hardware, and it gives them an easy way to integrate their Apache Kafka applications with other AWS services. With Amazon MSK, customers can stand up Apache Kafka clusters in minutes instead of weeks, so they can spend more time focusing on the applications that impact their businesses.”

Matt Barclay

Product Director for Cloud

Matt Barclay is Product Director for Cloud at Global Knowledge UK&I. He has many years of industry experience, with a focus on Cloud and Software Development. He works closely with our key vendors such as AWS and Microsoft to help drive success, address our customers' challenges and ensure our offerings are in line with current trends.