Today, the emergence of PaaS services has made it easier for end users to build, maintain, and manage infrastructure. However, selecting the service that suits a given need remains a challenging task. We often choose a hybrid cloud solution for our customers, giving them cost-efficient solutions built on cutting-edge technologies.
The fundamental building block of any company is data; without it, no organization can survive. However, the traditional warehouse approach to storing and analyzing this data no longer fits well, whether because of rising costs, infrastructure demands, or management overhead.
The alternative is the cloud, be it AWS, Azure, Google, or another provider. Each of these clouds offers different solutions to the problems we face. But the fundamental question remains the same: which cloud to use, and why?
Take data analytics. For running ETL jobs, both AWS and Azure offer solutions, but as architects we need to understand the similarities and differences between the two in depth before recommending one to a customer.
Here I highlight some fundamental similarities and differences between the two technologies, in the hope that it helps those who design solutions for customers.
Similar features of the two services
Attribute | AWS Glue | Data Factory |
Fully managed, serverless ETL engine | Yes | Yes |
Ingests both structured and unstructured data | Yes | Yes |
Auto generation of code | Yes | Yes |
Underlying technology stack: Spark | Yes | Yes |
Trigger type can be manual as well as automatic | Yes | Yes |
Enables you to focus on building business logic and data transformation | Yes | Yes |
Perform data cleaning, transformation and aggregation | Yes | Yes |
Connects to data warehouses and data lakes | Yes, supports data to and from Redshift | Yes, supports data to and from SQL DW |
Transparent Pricing | Yes | Yes |
Support SLAs | Yes | Yes |
Ability for customers to add new data sources | Developers can write custom Scala or Python code and import custom libraries and Jar files into Glue ETL jobs to access data sources not natively supported by AWS Glue. | Yes |
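To make the "data cleaning, transformation and aggregation" row more concrete, here is a minimal sketch in plain Python. It is not AWS Glue or Data Factory code; the sample records, field names, and the `clean_and_aggregate` helper are hypothetical and only illustrate the shape of the step that either service would run at scale.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from a source system.
RAW_ORDERS = [
    {"region": " east ", "amount": "120.50"},
    {"region": "West", "amount": "80"},
    {"region": "east", "amount": None},  # missing amount: dropped during cleaning
    {"region": "WEST", "amount": "19.50"},
]


def clean_and_aggregate(records):
    """Clean raw records, normalize fields, and total the amount per region."""
    totals = defaultdict(float)
    for record in records:
        # Cleaning: skip rows without a usable amount.
        if record.get("amount") is None:
            continue
        # Transformation: normalize the region name and cast the amount.
        region = record["region"].strip().lower()
        amount = float(record["amount"])
        # Aggregation: sum amounts per region.
        totals[region] += amount
    return dict(totals)


if __name__ == "__main__":
    print(clean_and_aggregate(RAW_ORDERS))
```

In a real job the same three stages would typically be expressed against the service's own engine rather than in a local loop.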
Differences between these two services
Attribute | AWS Glue | Data Factory |
Main Focus of service | ETL, data catalog | ETL |
Database replication | Full table; incremental via change data capture through AWS Database Migration Service (DMS) | Full table; incremental via custom SELECT query |
SaaS sources | None | About 20, with several more in preview |
Compliance, governance, and security certifications | HIPAA, GDPR | HIPAA, GDPR, ISO 27001 |
Data sharing | Yes, within AWS | No |
Vendor lock-in | AWS Glue is strongly tied to the AWS platform. Usage is billed monthly. | Month to month |
Developer tools | Only Python and Scala are supported. | REST API, .NET and Python SDKs, PowerShell CLI |
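The "incremental via custom SELECT query" entry in the database replication row usually means filtering on a change-tracking column and remembering the highest value seen. The sketch below illustrates that pattern with Python's built-in `sqlite3` module. It is not Data Factory's API; the `customers` table, the `updated_at` column, and both helper functions are assumptions made for the example.

```python
import sqlite3


def build_demo_source():
    """Create a hypothetical in-memory source table with a change timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "Asha", "2024-01-01T10:00:00"),
            (2, "Ben", "2024-01-02T09:30:00"),
            (3, "Chen", "2024-01-03T14:15:00"),
        ],
    )
    return conn


def fetch_incremental(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed since the last run.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


if __name__ == "__main__":
    source = build_demo_source()
    changed, watermark = fetch_incremental(source, "2024-01-01T10:00:00")
    print(changed)
    print(watermark)
```

Each run copies only the rows newer than the stored watermark. Note that this approach does not detect deleted rows, which is one reason log-based change data capture, the route listed for AWS Glue via DMS, is sometimes preferred.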
Thanks for reading. Your suggestions and feedback are welcome.