If you need to add records to an external table created with CETAS (CREATE EXTERNAL TABLE AS SELECT) in Azure SQL Managed Instance, you will find that direct DML is blocked: external tables are read-only, so INSERT, UPDATE, and DELETE are not allowed. A practical workaround is to create a separate CETAS table for each year and then combine them through a view.
Here’s what you can do:
Set up an individual CETAS table for each year, then create a view that combines all of them. UNION ALL is usually the right operator here, because it simply concatenates the yearly tables; plain UNION also removes duplicate rows, which costs extra work and is only needed if the same row can appear in more than one table. This gives you one consolidated object to query without touching the original tables. Because a view is evaluated at query time, it only has to be redefined when a new yearly table is added, and that step is easy to automate with a small script or a scheduled job.
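As a rough sketch of how that automation could look, the Python script below only generates the T-SQL text; it does not connect to the database. Every object name in it (dbo.Sales, dbo.Sales_<year>, dbo.vw_Sales, SalesDataSource, ParquetFormat, the OrderDate column, and the sales/<year>/ folder) is a made-up placeholder, and it assumes the external data source and file format already exist and that all yearly tables share the same columns. Adjust it to your own schema before using it.

```python
from typing import Iterable


def build_cetas_sql(year: int) -> str:
    """Return a CETAS statement that exports one year of rows to its own external table.

    All object names below are hypothetical placeholders.
    """
    year = int(year)  # refuse anything that is not a plain number
    return (
        f"CREATE EXTERNAL TABLE dbo.Sales_{year}\n"
        "WITH (\n"
        f"    LOCATION = 'sales/{year}/',\n"
        "    DATA_SOURCE = SalesDataSource,\n"
        "    FILE_FORMAT = ParquetFormat\n"
        ")\n"
        "AS\n"
        "SELECT * FROM dbo.Sales\n"
        f"WHERE OrderDate >= '{year}0101' AND OrderDate < '{year + 1}0101';"
    )


def build_union_view_sql(view_name: str, table_prefix: str, years: Iterable[int]) -> str:
    """Return a CREATE OR ALTER VIEW statement that stacks the yearly tables with UNION ALL."""
    unique_years = sorted({int(y) for y in years})  # de-duplicate and order the years
    if not unique_years:
        raise ValueError("at least one year is required")
    selects = [f"SELECT * FROM {table_prefix}{y}" for y in unique_years]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR ALTER VIEW {view_name} AS\n{body};"
```

For example, when the 2025 table is added you would run the output of build_cetas_sql(2025) once, then run the output of build_union_view_sql("dbo.vw_Sales", "dbo.Sales_", range(2022, 2026)) to refresh the view definition. A scheduled job could do both steps at the start of each year.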
When you append new records to large datasets, it is natural to think about merging files instead. If your CETAS tables are backed by Parquet files, merging them programmatically can be tricky. You might be tempted to switch to JSON, because merging JSON files with a PowerShell script is fairly straightforward. Rather than changing formats, though, it is better to use tools that are designed to handle large data files efficiently.
Azure Data Factory and Spark are good options because both can read many files and write the combined result back out as part of a pipeline or job. With these tools you can keep your data in Parquet and avoid converting it to another format, and they scale to large data volumes, which makes the workflow easier to run and maintain.
I hope these tips help you streamline your data management process. If you have more questions or need further guidance, feel free to ask. Remember to close the discussion by upvoting and accepting it as the solution if you found it helpful!