LINQ to SQL produces incorrect TSQL when using UNION or CONCAT

When a LINQ to SQL query contains a Union or Concat with a second query, and the second query references a column twice, a SqlException will occur.

var a = from address in dc.Addresses
select new {
ID = address.AddressID,
Address1 = address.AddressLine1,
Address2 = address.AddressLine2,
};
var b = from address in dc.Addresses
select new {
ID = address.AddressID,
Address1 = address.AddressLine1,
Address2 = address.AddressLine1, // notice AddressLine1 repeated
};
var q = a.Take(10).Union (b.Take(10));
q.ToArray ();



SqlException: All the queries in a query expression containing a UNION operator must have the same number of expressions in their select lists.

SELECT [t2].[AddressID] AS [ID], [t2].[AddressLine1] AS [Address1], [t2].[AddressLine2] AS [Address2]
FROM (
SELECT TOP (10) [t0].[AddressID], [t0].[AddressLine1], [t0].[AddressLine2]
FROM [Person].[Address] AS [t0]
UNION
SELECT TOP (10) [t1].[AddressID], [t1].[AddressLine1]
FROM [Person].[Address] AS [t1]
) AS [t2]



Notice the third SELECT statement is only selecting two columns instead of the required three.

Please rate and validate this bug at the MSDN Microsoft Product Feedback Center so Microsoft responds with a solution or workaround.

6 comments on “LINQ to SQL produces incorrect TSQL when using UNION or CONCAT

  1. A workaround to this issue is to force LINQ to generate some non-affecting string manipulation on the second occurance of the column. eg:

    Address2 = address.AddressLine1.Replace(“”,””)

    will then generate SQL will two columns similar to below:

    SELECT AddressLine1, REPLACE(AddressLine1,”,”) as Value
    FROM Address

  2. Hello ,

    I test your code . It is really a nice.

    But in my opinion problem is not with union . Problem is with

    var b = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine1, // notice AddressLine1 repeated

    Possible work arround would be

    var a = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine2,
    };
    var b = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine1.ToString(), // notice AddressLine1 repeated
    };
    var q = a.Take(10).Union (b.Take(10));
    q.ToArray ();
    };

    If you see query generated by this it only contain
    SELECT [t0].[AddressID], [t0].[AddressLine1]
    FROM [Person].[Address] AS [t0]

    b’coz as AddressLine column repeated so Linq internally main thoes thing . So for query optimization it only fetch one column then display in result.

    When you apply this to union so schema mismatch between two query.

  3. Hello ,

    I test your code . It is really a nice.

    But in my opinion problem is not with union . Problem is with

    var b = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine1, // notice AddressLine1 repeated

    If you see query generated by this it only contain
    SELECT [t0].[AddressID], [t0].[AddressLine1]
    FROM [Person].[Address] AS [t0]

    b’coz as AddressLine column repeated so Linq internally main thoes thing . So for query optimization it only fetch one column then display in result.

    When you apply this to union so schema mismatch between two query.

    Possible work arround would be

    var a = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine2,
    };
    var b = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine1.ToString(), // notice AddressLine1 repeated
    };
    var q = a.Take(10).Union (b.Take(10));
    q.ToArray ();
    };

  4. one another work arround would be . But this will be litte costly performance wise.

    var a = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine2,
    };
    var b = from address in dc.Addresses
    select new {
    ID = address.AddressID,
    Address1 = address.AddressLine1,
    Address2 = address.AddressLine1, // notice AddressLine1 repeated
    };
    var q = a.ToList().Take(10).Union(b.ToList().Take(10));
    q.ToArray ();

Comments are closed.