Azure SQL bulk insert error
Issue - You do not have permissions to list the data using your user account with Azure AD. Click to learn more about authenticating with Azure AD.
Resolution
If you are using an AAD token, this error means a role assignment is missing for your user. Go to Storage account -> Access Control (IAM) -> Add -> Add role assignment, select the Storage Blob Data Owner role, and add your user name under it.
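Once the role is assigned (AAD role grants can take a few minutes to propagate), you can sanity-check that your account can now list the container before re-running the bulk insert. Below is a minimal Python sketch using the azure-identity and azure-storage-blob packages; the account and container names are placeholders, not values from this post.

```python
# Minimal sketch: verify AAD-based access to Blob storage after the
# Storage Blob Data Owner role assignment. Account and container names
# are placeholders - substitute your own.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# DefaultAzureCredential picks up your Azure AD login
# (az login, Visual Studio, managed identity, etc.)
credential = DefaultAzureCredential()
service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=credential,
)

# If the role assignment took effect, listing succeeds instead of
# raising the "you do not have permissions to list" authorization error.
container = service.get_container_client("<container>")
for blob in container.list_blobs():
    print(blob.name)
```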
Hi friends,
I hit this issue when my system restarted: on reopening the SSIS project in Visual Studio, my saved package was not found.
To resolve this issue:
1. Open the file location by right-clicking on the project solution.
2. Right-click on the package node, click Add Package, and choose your missing package.
This error occurs if the name of the SSIS package was recently changed.
Bingo! Issue fixed
Hi All,
I encountered this error when a Conditional Split was used to implement SCD Type 2 in SSIS with a Lookup.
To resolve it, use the expression below:
(ISNULL(email) ? "1" : email) != (ISNULL(L_email) ? "1" : L_email) || (ISNULL(phone) ? "1" : phone) != (ISNULL(L_phone) ? "1" : L_phone)
i.e., apply ISNULL to both the source and lookup (destination) columns so the comparison never sees NULL values and can evaluate to a Boolean.
Hi Folks!
We get this error often with the latest version of SSIS. To resolve it, follow the steps below (a scripted version follows the list):
1. Go to Task Manager.
2. Open the Details tab.
3. Locate the process "DtsDebugHost.exe".
4. Kill this process. There may be multiple instances; kill all of them.
5. Re-execute the SSIS package.
Ah ha...issue resolved
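If this happens regularly, the cleanup can be scripted instead of hunting through Task Manager each time. Here is a small sketch using Python's standard library to call the Windows taskkill utility with the process name from the steps above:

```python
# Sketch: kill every instance of DtsDebugHost.exe left behind by SSIS
# debugging. Windows-only; taskkill is a built-in Windows utility.
import subprocess

result = subprocess.run(
    ["taskkill", "/F", "/IM", "DtsDebugHost.exe"],
    capture_output=True,
    text=True,
)
# taskkill returns non-zero when no matching process exists, which is
# fine here - it just means there was nothing to clean up.
print(result.stdout or result.stderr)
```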
Hi Guys,
My team encountered the issue below:
Power BI - Month is not sorted even when Sort by Month Number is applied
Resolution - Check the data type of the Month Number column. In our case it was "Text"; when we changed it to "Whole Number", the sort worked.
Bingo!
Hi Folks,
I got the below error while trying to import data from Azure Blob storage:
the column " " of the table wasn't found
Resolution - Check whether there are any "Removed Columns" steps in your Power Query steps. If so, remove them and reload the data.
2nd step - Refresh the table you are trying to load.
Issue - You can't schedule refresh for this dataset because one or more sources currently don't support refresh
Resolution - Download the report, open it in Power BI Desktop, and click Refresh on the report; the refresh then completes.
Publish the report again to replace the old one.
| | Azure Analysis Services | Azure Synapse |
| --- | --- | --- |
| Upgraded version | Azure Analysis Services is the upgraded version of SQL Server Analysis Services on the cloud. | Azure Synapse Analytics is the upgraded version of SQL DW + ML + Power BI that integrates with ADLS + Databricks. |
| Concurrent users | More than 1000 | 128 |
| DAX query support | Yes | No |
| Semantic model | Yes | Yes - through the Power BI model in Synapse Analytics |
| Dashboard development | Power BI | Power BI dataset |
| Data mining | No | Yes |
| Power BI connection type | Live connection | Direct/Import mode |
| | Azure Data Lake | Azure Delta Lake |
| --- | --- | --- |
| Definition | Data Lake Storage Gen2 allows you to easily manage massive amounts of data. A fundamental part of Data Lake Storage Gen2 is the addition of a hierarchical namespace to Blob storage, which organizes objects/files into a hierarchy of directories for efficient data access. | Delta Lake is an open-source storage layer from Spark that runs on top of an Azure Data Lake. Its core functionality brings reliability to big data lakes by ensuring data integrity with ACID transactions. |
| Data format | All raw data coming from different sources can be stored in an Azure Data Lake without pre-defining a schema; it can hold all types of data from many different sources without any need to process it first. | |
| Data integrity | An Azure Data Lake usually has multiple data pipelines reading and writing data concurrently, and it is hard to preserve data integrity given how big-data pipelines work. | |
| ACID transactions | If a pipeline fails while writing to a data lake, the data may be partially written or corrupted, which severely affects data quality. | Delta is ACID compliant, so a write operation is guaranteed to either finish completely or not at all, which prevents corrupted data from being written. |
| Unified batch and stream sources and sinks | There is no possibility of concurrent jobs reading and writing from/to the same data. | With Delta, the same functions can be applied to both batch and streaming data, and with any change in business logic the data is guaranteed to be consistent in both sinks. |
| Schema enforcement & schema evolution | Incoming data can change over time; in a Data Lake this can result in data-type compatibility issues, corrupted data entering the lake, etc. | With Delta, incoming data with a different schema can be prevented from entering the table to avoid corrupting the data. If enforcement isn't needed, users can easily change the schema of the table to intentionally adapt to data changing over time. |
| Time travel | In a Data Lake, data is constantly modified, so if a data scientist wants to reproduce an experiment with the same parameters from a week ago, it is not possible unless the data is copied multiple times. | With Delta, users can go back to an older version of the data to reproduce experiments, fix wrong updates/deletes or other transformations that produced bad data, audit data, etc. |
| Prevent data corruption | | A Delta Lake table may include NOT NULL constraints on columns, which cannot be enforced on a regular Parquet table; this prevents records with NULL values from being loaded into columns that require data. Using a MERGE statement, a pipeline can be configured to INSERT a new record or ignore records already present in the Delta table. |
| Query execution | An expensive LIST operation on blob storage for each query. | The Delta transaction log serves as the manifest. |
| Processing engine | Apache Spark | Delta engine - Photon, a native vectorization engine written in C++, optimized with performance features like indexing; Delta Lake customers have seen ETL workloads execute up to 48% faster. |
| Achieve compliance | New laws such as GDPR and CCPA require that companies be able to purge data pertaining to a customer should the individual request it. | Delta Lake includes DELETE and UPDATE actions for easy manipulation of data in a table. |
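To make the MERGE, time travel, and GDPR-style DELETE rows above concrete, here is a minimal PySpark sketch using the delta-spark package. The table path, source path, and column names are illustrative placeholders, not values from this comparison.

```python
# Sketch of the Delta Lake features from the table above, using the
# delta-spark Python API. Paths and column names are illustrative.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
path = "/mnt/delta/customers"

# MERGE: insert new records, ignore ones already in the Delta table.
target = DeltaTable.forPath(spark, path)
updates = spark.read.format("parquet").load("/mnt/raw/customers")
(target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenNotMatchedInsertAll()
    .execute())

# Time travel: re-read the table as it was at an earlier version.
last_week = (spark.read.format("delta")
    .option("versionAsOf", 5)  # or .option("timestampAsOf", "2023-01-01")
    .load(path))

# GDPR/CCPA purge: delete all rows for one customer.
target.delete("customer_id = '12345'")
```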
| | Dedicated SQL Pool | Serverless SQL Pool | Apache Spark pool |
| --- | --- | --- | --- |
| Architecture | A node-based architecture in which applications connect and issue T-SQL to a control node, the single point of entry in Synapse SQL. The control node is the brain of the MPP engine and coordinates all compute activity. | A node-based design built on a Distributed Data Processing (DDP) system: queries are split into smaller queries and executed on the compute nodes. The control node uses the distributed query engine to optimize queries for parallel processing; each small query is called a "task" and represents a distributed execution unit. | Apache Spark is a parallel processing framework that supports in-memory processing to boost the performance of big-data analytic applications. |
| Single source of truth | Yes | No | |
| Scale up and down / compute | Manual - yes | No manual scaling, as compute is managed by Microsoft | Autoscale |
| Use cases | Power BI DirectQuery | Exploring data ad hoc: you can analyze CSV, Parquet, Delta and other file formats stored in Azure Storage using simple T-SQL queries; building a logical data warehouse. | |
| Payment model | Cost per hour (DWU) | Cost per TB of data processed ("pay-per-query": you only pay for the amount of data that is processed) | |
| Data retrieval | Fast: distribution strategies can be implemented (round-robin, hash, ...) and queries can be cached. | Mediocre: results cannot be cached, and data needs to be fetched from storage. | |
| Ingest data back to the data lake | Yes | No - you can only query the data lake | |
| Data storage | Relational database | Data lake | |
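As an illustration of the serverless "pay-per-query" model above, you can point any T-SQL client at the serverless endpoint and query files in the lake in place. Here is a sketch using Python's pyodbc; the workspace endpoint, storage account, and Parquet path are placeholders:

```python
# Sketch: ad-hoc query against a Synapse serverless SQL pool with
# OPENROWSET. Endpoint, storage account and file path are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
    "DATABASE=master;"
    "Authentication=ActiveDirectoryInteractive;"
)

# Serverless reads the Parquet files in place - you pay per TB scanned.
sql = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storage-account>.dfs.core.windows.net/<container>/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS rows
"""
for row in conn.cursor().execute(sql):
    print(row)
```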
| | Synapse | Azure SQL DB |
| --- | --- | --- |
| Use case | Azure Dedicated SQL pool (formerly SQL DW) is the best fit for an online database query answering system (OLAP and data warehousing). | Azure SQL Database is optimized for the CRUD operations (Create, Read, Update and Delete) you typically perform from an application. |
| Architecture | Azure Synapse Analytics uses massively parallel processing (MPP): a query is processed by a dedicated node with its own CPU and memory, and an Azure Dedicated SQL pool can be spread across multiple machines to meet scale demands. | Azure SQL uses symmetric multiprocessing (SMP). |
| Active geo-replication | No | Yes |
| Dynamic data masking | No | Yes |
| Data encryption at rest | Yes | Yes |
| PolyBase T-SQL queries | Yes | Yes |
| Automatic tuning | No | Yes |
| Ability to pause and resume | Yes | No |
| Max data per database | 1 PB | 4 TB |
| Max concurrent open sessions | 1024 | 30000 |
| Max concurrent queries | 32 | 6400 |
| Use for application database | No | Yes |
| Use for a data warehouse with large amounts of data and a small number of users | Yes | No |
| Use for a data warehouse with data up to 4 TB and a large number of users | No | Yes |
| Triggers | No | Yes |
| Partitioning methods | Sharding, horizontal partitioning | Tables can be distributed across several files (horizontal partitioning); sharding through federation |
| Foreign keys | No | Yes |
| Scalability | Scale out (horizontal) | Scale up (vertical) |
| Units of performance | Data Warehouse Units (DWU) and Compute DWU (cDWU) | Database Transaction Units (DTU) |
| Max size | Up to 1 PB with compression (240 TB on disk) | 4 TB |
| Pricing range per month (approx.) | £900 - £130k, but you can pause it to reduce cost | £4 - £15k |
| Backup & recovery | Geo-redundant backup every 24 hrs | Geo-redundant backup every 5-10 minutes, point-in-time recovery, readable replicas |
| JSON, XML and graph support | No | Yes |
| Encryption | TDE | Always Encrypted and TDE |
| In-memory OLTP | No | Yes |
| Cross-database queries | No | Yes |