Implementing tail with .NET & Rx

By default the tail command-line utility displays the last 10 lines of a file to standard output. To learn Rx, I endeavor to create a full .NET 4.0 version of tail. Here’s the initial version:

using (var stream = new FileStream (
    args[0],
   
FileMode
.Open,
   
FileAccess
.Read,
   
FileShare
.Read,
    2 << 15,
   
true
)) {
    stream.AsyncReadLines ().TakeLast (10).Run (
Console.WriteLine);
}

The majority of this code is the .NET call to open the file using async I/O. The 1st Rx method call is AsyncReadLines () which transforms the .NET stream into an IObservable<string>. TakeLast(10) ignores all strings observed except for the last 10 lines. Run() blocks the current thread until the IObservable<T> fires OnCompleted(), and I also pass in Console.WriteLine as the argument to use for the Action<string> delegate to write the last ten lines to the console. The output of this code is identical to running tail without any parameters.

Sequence membership

Given two sorted sequences, this method returns the intersection, a-b, and b-a, using a single iteration over both sequences. The return type is an IEnumerable<KeyValuePair<T, int>>, where:


  • only in A = -1

  • in both = 0

  • only in B = 1


/// <summary>
/// Return an int for every element in sortedSetA and sortedSetB specifying objects
/// that belong to A but not to B (-1), objects which are both in A and in B (0), and objects
/// that belong to B but not A (1).
/// </summary>
/// <remarks>This method is an O(n+m) operation when the two sets have different members, where n
/// is the count of <paramref name="sortedSetA"/> and m is the count of <paramref name="sortedSetB"/>.
/// Otherwise the operation approaches O(n+m-n) where m is the count of the larger set and n is the
/// smaller set.
/// </remarks>
/// <typeparam name="T">Type of element in each set</typeparam>
/// <param name="setA">A set where each element is of type <typeparamref name="T"/></param>
/// <param name="setB">A set where each element is of type <typeparamref name="T"/></param>
/// <returns>Returns A \ B (-1), the intersection of A and B (0), and B \ A (1)</returns>
public static IEnumerable<KeyValuePair<T, int>> Membership<T> (IEnumerable<T> sortedSetA, IEnumerable<T> sortedSetB) {
return Membership<T> (sortedSetA, sortedSetB, Comparer<T>.Default);
}
/// <summary>
/// Return a KeyValuePair<int, T> for every element in sortedSetA and sortedSetB where the int
/// specifies membership and T is the element. The int specifies objects that belong to A but
/// not to B (-1), objects which are both in A and in B (0), and objects that belong to B but
/// not A (1).
/// </summary>
/// <remarks>This method is an O(n+m) operation when the two sets have different members, where n
/// is the count of <paramref name="sortedSetA"/> and m is the count of <paramref name="sortedSetB"/>.
/// Otherwise the operation approaches O(n+m-n) where m is the count of the larger set and n is the
/// smaller set.
/// </remarks>
/// <typeparam name="T">Type of element in each set</typeparam>
/// <param name="setA">A set where each element is of type <typeparamref name="T"/></param>
/// <param name="setB">A set where each element is of type <typeparamref name="T"/></param>
/// <param name="comparer"><see cref="IComparer"/> used to sort the sets</param>
/// <returns>Returns A \ B (-1), the intersection of A and B (0), and B \ A (1)</returns>
public static IEnumerable<KeyValuePair<T, int>> Membership<T> (IEnumerable<T> sortedSetA, IEnumerable<T> sortedSetB, IComparer<T> comparer) {
const int onlyA = -1;
const int both = 0;
const int onlyB = 1;

if (sortedSetA == null)
throw new ArgumentNullException ("sortedSetA");
if (sortedSetB == null)
throw new ArgumentNullException ("sortedSetB");
if (comparer == null)
throw new ArgumentNullException ("comparer");

IEnumerator<T> enumeratorA = sortedSetA.GetEnumerator ();
IEnumerator<T> enumeratorB = sortedSetB.GetEnumerator ();

bool nextA = enumeratorA.MoveNext ();
bool nextB = enumeratorB.MoveNext ();

// default value for type T
T a = default (T);
// default value for type T
T b = default (T);

// if both collections have a value
if (nextA & nextB) {
// get current value
a = enumeratorA.Current;
// get current value
b = enumeratorB.Current;

do {
// Compare a to b: is a < b, a == b, or a > b ?
int val = comparer.Compare (a, b);
// a == b
if (val == 0) {
// return both collections have the value a
yield return new KeyValuePair<T, int> (a, both);
// if collection a can move next
if (nextA = enumeratorA.MoveNext ())
// get the next a
a = enumeratorA.Current;
// if collection b can move next
if (nextB = enumeratorB.MoveNext ())
// get the next b
b = enumeratorB.Current;
}
// a < b
else if (val < 0) {
// return only collection a has the value a
yield return new KeyValuePair<T, int> (a, onlyA);
// if collection a can move next
if (nextA = enumeratorA.MoveNext ())
// get the next a
a = enumeratorA.Current;
}
// a > b
else {
// return only collection b has the value b
yield return new KeyValuePair<T, int> (b, onlyB);
// if collection b can move next
if (nextB = enumeratorB.MoveNext ())
// get the next b
b = enumeratorB.Current;
}
// loop while there are values for both collections
} while (nextA & nextB);
}
// if collection a has more values
if (nextA) {
// return only collection a has the value a
yield return new KeyValuePair<T, int> (a, onlyA);
// if collection a can move next
while (enumeratorA.MoveNext ()) {
// return only collection a has the value a
yield return new KeyValuePair<T, int> (enumeratorA.Current, onlyA);
}
}
// if collection b has more values
else if (nextB) {
// return only collection b has the value b
yield return new KeyValuePair<T, int> (b, onlyB);
// if collection b can move next
while (enumeratorB.MoveNext ()) {
// return only collection b has the value b
yield return new KeyValuePair<T, int> (enumeratorB.Current, onlyB);
}
}
}

c# Language request for properties

Auto-Implemented properties are great, until you need to implement custom get or set logic. I love the ability to make the field an implementation detail, until you need to implement the property. What if you could continue to hide the field, like this:

public string FirstName {
get {
// do more stuff, like lazy init the field
if (field == null)
field = "Unknown";
return field;
}
set {
field = value;
}
}

A new keyword, field, references the compiler generated property’s backing field. Also, for readonly fields:

public readonly string Id {
get {
return field;
}
set { // private is implied
// do more stuff, like don't accept null
if (value == null)
value = string.Empty;
field = value;
}
}

Where the property would need to be set in the constructor (like read-only fields) as the backing field would be marked read-only. For simplicity, the implementation would use simple property syntax while the compiled output would involve a read-only field, a truly read-only property and a constructor that executes the property’s set body via a static method call each time the field is set in the constructor.

Please rate and validate this sugestion at the MSDN Microsoft Product Feedback Center.

LINQ to SQL produces incorrect TSQL when using UNION or CONCAT

When a LINQ to SQL query contains a Union or Concat with a second query, and the second query references a column twice, a SqlException will occur.

var a = from address in dc.Addresses
select new {
ID = address.AddressID,
Address1 = address.AddressLine1,
Address2 = address.AddressLine2,
};
var b = from address in dc.Addresses
select new {
ID = address.AddressID,
Address1 = address.AddressLine1,
Address2 = address.AddressLine1, // notice AddressLine1 repeated
};
var q = a.Take(10).Union (b.Take(10));
q.ToArray ();



SqlException: All the queries in a query expression containing a UNION operator must have the same number of expressions in their select lists.

SELECT [t2].[AddressID] AS [ID], [t2].[AddressLine1] AS [Address1], [t2].[AddressLine2] AS [Address2]
FROM (
SELECT TOP (10) [t0].[AddressID], [t0].[AddressLine1], [t0].[AddressLine2]
FROM [Person].[Address] AS [t0]
UNION
SELECT TOP (10) [t1].[AddressID], [t1].[AddressLine1]
FROM [Person].[Address] AS [t1]
) AS [t2]



Notice the third SELECT statement is only selecting two columns instead of the required three.

Please rate and validate this bug at the MSDN Microsoft Product Feedback Center so Microsoft responds with a solution or workaround.