Notes

Friday, July 01, 2005

Join-like functionality in Microsoft Shell

Several implementations have been proposed. They either construct new custom objects that include the properties of both inputs, or leave the definition of the join operation to the user. Let's implement another way to join data from several streams...

The task is to produce an object that has the properties of two or more input objects. The closest parallel is the SQL JOIN clause, which virtually constructs a new table made up of selected columns of two source tables. In an object-oriented environment this amounts to declaring a class for the new objects and constructing its instances. Using reflection to emit and instantiate such a class is not investigated here; instead, the MshObject class is used to construct new objects.

First, the point of view should be shifted slightly. There are two input sequences of objects. Usually the first sequence arrives via the pipeline, while the second was generated earlier and is passed as a parameter. Exploiting this obvious asymmetry, a selected set of properties can be added to the existing object from the pipeline input. The MshNoteProperty class helps to add properties. The following function adds properties according to a given hashtable.

function global:add-note{
  param([collections.hashtable] $hash)
  process{
    foreach ($key in $hash.keys) {
      # process each key-value pair
      $note=new-object management.automation.mshNoteProperty `
                                   -arguments $key, $hash[$key]
      $_.mshObject.properties.add($note)
    }
    $_
  }
}

MSH> gci | add-note @{'SomeProperty'='SomeValue'} | ft name, someProperty
Name                                SomeProperty
----                                ------------
foo.txt                             SomeValue
...

Values can also be generated by a script block. The script block uses the $_ automatic variable to access the current object.

      ...
      # process each key-value pair
      if ($hash[$key] -is [scriptblock]) {$v=&$hash[$key]}
      else {$v=$hash[$key]}
      $note=new-object system.management.automation.mshNoteProperty `
                                                 -arguments $key, $v
      $_.mshObject.properties.add($note)
      ...

MSH> gci | 
>> add-note @{'Tp'={$_.getType()},'Btp'={$_.getType().baseType}} |
>> ft name,tp,btp

Name             Tp                        Btp
----             --                        ---
foo.txt          System.IO.FileInfo        System.IO.FileSystemInfo
Cookies          System.IO.DirectoryInfo   System.IO.FileSystemInfo
...
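
The same idea can be sketched outside the shell. The Python fragment below only illustrates the semantics and is not the MSH implementation: the name `add_note`, the `SimpleNamespace` objects standing in for pipeline objects, and the callables standing in for script blocks are all assumptions made for the example.

```python
from types import SimpleNamespace

def add_note(items, notes):
    """Yield each item after attaching the properties described by `notes`.

    A callable value plays the role of a script block: it receives the
    current item (the counterpart of $_) and returns the property value.
    """
    for item in items:
        for key, value in notes.items():
            setattr(item, key, value(item) if callable(value) else value)
        yield item

# Hypothetical stand-ins for the objects coming down the pipeline.
entries = [SimpleNamespace(name="foo.txt"), SimpleNamespace(name="bar")]

result = list(add_note(entries, {
    "kind": "entry",                       # constant value
    "name_length": lambda e: len(e.name),  # computed per object
}))

for e in result:
    print(e.name, e.kind, e.name_length)
```

As in the MSH function, the incoming object itself is extended, so every input object yields exactly one output object.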

An SQL JOIN may return a row of the first table more than once. To mimic this behavior, a new object has to be constructed for each output row. The MshObject class seems suitable for this.

  process{
    # normalize args
    if ($hash -eq $null) {return}
    elseif ($hash -is [system.collections.hashtable]) {$ha=@($hash)}
    elseif ($hash -is [array]) {$ha=$hash}
    else {
      throw 'Please provide a hashtable or an array of hashtables.'
    }

    # process each hashtable
    foreach ($h in $ha) {
      $res=new-object system.management.automation.mshObject $_
      foreach ($key in $h.keys) {
        # process each key-value pair
        if ($h[$key] -is [scriptblock]) {$v=&$h[$key]}
        else {$v=$h[$key]}
        $note=new-object system.management.automation.mshNoteProperty `
                                                   -arguments $key, $v
        $res.mshObject.properties.add($note)
      }
      $res
    }
  }
  ...

MSH> gci|
>> add-note @(
>>   @{'Tp'={$_.getType()}},
>>   @{'Btp'={$_.getType().baseType}}
>> ) | ft name,tp,btp

Name               Tp                      Btp
----               --                      ---
foo.txt            System.IO.FileInfo      
foo.txt                                    System.IO.FileSystemInfo
...
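
The "one output row per hashtable" behavior can be illustrated with a small Python sketch. It is an assumption-laden analogy rather than the MSH code: `add_notes` is a hypothetical name, `SimpleNamespace` stands in for pipeline objects, and a shallow copy stands in for wrapping the input in a new MshObject.

```python
from types import SimpleNamespace

def add_notes(items, note_sets):
    """For every item, yield one new object per dict in `note_sets`."""
    for item in items:
        for notes in note_sets:
            # A fresh object per output row, so rows do not share
            # the added properties (stand-in for a new MshObject).
            row = SimpleNamespace(**vars(item))
            for key, value in notes.items():
                setattr(row, key, value(item) if callable(value) else value)
            yield row

entries = [SimpleNamespace(name="foo.txt")]

rows = list(add_notes(entries, [
    {"tp": "FileInfo"},
    {"btp": "FileSystemInfo"},
]))

for row in rows:
    print(row.name, getattr(row, "tp", ""), getattr(row, "btp", ""))
```

Each input object produces two rows here, and each row carries only the properties of its own dictionary, which mirrors the half-empty columns in the table above.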

Finally, add support for a script block that produces the hashtables.

      ...
      # normalize args
      if ($hash -is [scriptblock]) {$ht=&$hash}
      else {$ht=$hash}

      if ($ht -eq $null) {return}
      elseif ($ht -is [system.collections.hashtable]) {$ha=@($ht)}
      elseif ($ht -is [array]) {$ha=$ht}
      else {
        throw ('Please provide a hashtable, an array of hashtables, ' +
          'or a scriptblock returning such an array.')
      }
        
      # process each hashtable
      ...

MSH> gci .\dir1 | add-note {
>>  foreach ($f in gci .\dir2) {
>>    if($f.name -eq $_.name){@{dir1=$_;dir2=$f}}
>>  }
>>} | ft dir1,dir2

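Note that the script block not only generates the values for the properties but also acts as the ON part of the SQL JOIN clause: an input object yields an output row only for each hashtable the script block returns. In the following example the script block filters the rows like a WHERE clause.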

MSH> gci | add-note {
>>  if ($_ -is [io.directoryInfo]) {
>>    $s=0
>>    foreach ($f in gci $_.fullName -rec) 
>>      {$s+=$f.length}
>>    @{TotalLength=$s}
>>  }
>>} | ft name, totalLength

Name                          TotalLength
----                          -----------
Links                                1554
Vision                              12362
...
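
The way a script block doubles as the ON or WHERE condition can be shown with a short Python sketch. Everything in it is illustrative and assumed for the example: `join_notes` and `match_by_name` are hypothetical names, `SimpleNamespace` replaces pipeline objects, and a function returning a list of dicts replaces the script block returning hashtables.

```python
from types import SimpleNamespace

def join_notes(items, make_notes):
    """Yield one new row per dict that `make_notes(item)` returns.

    Returning no dicts drops the item, so `make_notes` acts as the
    ON / WHERE condition as well as the source of property values.
    """
    for item in items:
        for notes in make_notes(item):
            row = SimpleNamespace(**vars(item))
            for key, value in notes.items():
                setattr(row, key, value)
            yield row

left = [SimpleNamespace(name="a.txt"), SimpleNamespace(name="b.txt")]
right = [SimpleNamespace(name="b.txt", size=10),
         SimpleNamespace(name="c.txt", size=20)]

def match_by_name(item):
    # The comparison is the join condition; unmatched items return [].
    return [{"other": r} for r in right if r.name == item.name]

joined = list(join_notes(left, match_by_name))

for row in joined:
    print(row.name, row.other.size)
```

Only the entry present in both lists survives, which corresponds to an inner join on the name.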