IronPython Class Factory for MSBuild 4.0 Part 4
Example 4: PyProduceHashManifest task, make a more concise version of the C#
ProduceHashManifest Inline Task from an earlier article
I had a previous article on MSBuild 4.0 that discussed a custom task
named ProduceHashManifest in order to illustrate some of the new features in Visual Studio/Team Foundation Server 2010. The idea was to generate an MD5 hash
value based on the contents of each file produced by Team Build, and store the results in a per-build hashManifest.txt file. That ProduceHashManifest task used
something like 65 lines of C# code, written inline within an MSBuild <Task> element. Let's try to make it shorter in Python:
.targets
<!-- Task: PyProduceHashManifest, use Python to compute the MD5 hash of a file so it can be written to the log -->
<UsingTask TaskName="PyProduceHashManifest" TaskFactory="PythonClassFactory"
AssemblyFile="..\_compiled\PythonClassFactory.dll" >
<ParameterGroup>
<inFilePath ParameterType="System.String" Required="true" />
<outHash Output="true" />
</ParameterGroup>
<Task>
<![CDATA[
import sys
sys.path.append(r'.\partialPythonLib')
import hashlib
outHash = hashlib.md5((open(inFilePath, 'rb').read())).hexdigest()
]]>
</Task>
</UsingTask>
<ItemGroup>
<src Include="$(OutputDirectory)\*" />
</ItemGroup>
<Target Name="PyTarget4" Outputs="%(src.FullPath)">
<PyProduceHashManifest inFilePath="@(src)">
<Output PropertyName="outHash" TaskParameter="outHash" />
</PyProduceHashManifest>
<Message Text="FILEPATH: @(src) MD5: $(outHash)" Importance="High"/>
</Target>
Back to three main elements but now <ItemGroup> has replaced <PropertyGroup>:
1) UsingTask element
One input parameter, representing the file to be hashed, and one output, holding the hash value. The code here uses a slightly different way of accessing
the Python library files. In the previous example all of the Python library files from IronPython 2.6, everything in \Lib, needed to be in an expected
location before the build took place. The files could have been deployed as part of the build but as a whole they are large enough to a) slow down the
build, and b) take up significant space on the build server - one copy of that folder per build. For PyProduceHashManifest I pulled out the .py files containing
the libraries I needed, tested to make sure they in turn didn't depend on additional .py files, and checked into TFS. Specifically, hashlib.py and md5.py
were placed in a TFS project directory named partialPythonLib, located under _targets (making it a sibling to the PythonTargets.targets file holding
all of the MSBuild implementation).
With that in place the hashlib library can be imported at build runtime and then a single line of code used to open a target file, read it, and compute
the MD5 value for the file's content.
2) ItemGroup element
If an MSBuild PropertyGroup can be thought of as a single-value variable, an ItemGroup can be considered a single-dimensional array of values,
the most common use of which is to hold a list of files. When an ItemGroup holds files, certain metadata on those files may be accessed within the
MSBuild environment, which happens in the Target element below. The file list being assigned to the 'src' ItemGroup here includes everything in the
output directory, the path for which is passed in to the .targets file via the OutputDirectory property. Specifically, the MSBuild_PythonTargets
workflow Activity used its CommandLineArguments property to pass the value of 'outputDirectory' (which is defined in the workflow as being more or less
equal to the local build directory + "\Binaries") on to the .targets file.
3) Target element
Activation of PyTarget4 is a little tricky: referencing the FullPath metadata of the files in 'src' from the Outputs attribute
leads to Target batching (see How To: Batch Targets with Item Metadata
for further information). The end result is that the contents of PyTarget4 are processed multiple times - once for every file in the src ItemGroup, meaning
once for every file in the build's output directory. And the value assigned to the inFilePath parameter and passed on to PyProduceHashManifest for a
particular iteration is the file path of the 'current' file. A Message task is included in the target, to log the individual file paths and
matching MD5 hash values.
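To make the batching behavior concrete, here is a rough standalone Python sketch of what the batched target amounts to: one pass per file in the output directory, each pass producing a path and its MD5 hash. The function names (md5_of_file, batch_hash, format_messages) are made up for illustration and are not part of PythonClassFactory or MSBuild; the sketch also assumes the wildcard in the ItemGroup picks up files only, not subdirectories.

```python
import hashlib
import os


def md5_of_file(path):
    """Return the hex MD5 digest of the file's raw bytes."""
    # Binary mode so .exe/.pdb content is hashed byte for byte.
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()


def batch_hash(output_directory):
    """Emulate the batched target: one (path, hash) pair per file,
    analogous to one PyTarget4 pass per item in 'src'."""
    results = []
    for name in sorted(os.listdir(output_directory)):
        full_path = os.path.join(output_directory, name)
        # Assumption: the '*' wildcard yields files only.
        if os.path.isfile(full_path):
            results.append((full_path, md5_of_file(full_path)))
    return results


def format_messages(results):
    """Build the text the Message task would log for each pass."""
    return ["FILEPATH: %s MD5: %s" % pair for pair in results]
```

Calling format_messages(batch_hash(some_directory)) yields one line per file, mirroring the one Message call per batch in the target.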
Kick off one last build and take note of everything marked by a PyTarget4 header:
log
PyTarget4:
FILEPATH: C:\Builds\1\AllInOne\PythonBuildDef1\Binaries\CSEFModelFirst.exe
MD5: a02064d9ffbe3efdf3d3f64c88e3efa5
PyTarget4:
FILEPATH: C:\Builds\1\AllInOne\PythonBuildDef1\Binaries\CSEFModelFirst.exe.config
MD5: caf5819d624471bd244161f425820ed0
PyTarget4:
FILEPATH: C:\Builds\1\AllInOne\PythonBuildDef1\Binaries\CSEFModelFirst.pdb
MD5: 1f177e02f7763e21061754b918c1bee5
So four lines of Python vs. almost 70 in C#. Which is not a fair comparison because 1) I made no efforts to concisify the C# code in the first place,
2) I did make an effort to make the Python code as short as possible, where I would normally have done the file-open-and-hash on multiple lines in order
to increase readability (+ added a file.close()),
and 3) I cheated by writing the file paths and hash values to the log file instead of a standalone .txt as in the original C# ProduceHashManifest.
Probably could have gotten the Python script here down to two lines by adding hashlib.py and md5.py to TFS in the _targets directory, alongside the .targets
file, and getting rid of the two 'sys' lines. A much more theoretical solution would have been to put the two .py files in a folder specifically named 'Lib',
which I believe the IronPython executing process looks for automatically. And again, theoretically, that would have meant putting the Lib folder in
/CustomTasks/_compiled, where my IronPython.dll lives. What I saw instead was that the 'process' in this scenario appears to be msbuild.exe and the
Lib folder would instead need to be in C:\Windows\Microsoft.NET\Framework\v4.0.30128\, which is not a good idea.
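For comparison, a less compressed version of the hashing step might look something like the sketch below: the file is read in chunks, closed explicitly, and the results go to a standalone manifest file rather than the log. This is only an illustration. The helper names (md5_hex, write_manifest) and the tab-separated line format are assumptions, not the layout used by the original C# ProduceHashManifest task.

```python
import hashlib


def md5_hex(path, chunk_size=65536):
    """Hash a file in chunks so large binaries are not read in one go."""
    digest = hashlib.md5()
    f = open(path, 'rb')
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    finally:
        f.close()  # the explicit close the one-liner skips
    return digest.hexdigest()


def write_manifest(file_paths, manifest_path):
    """Write one 'path<TAB>hash' line per file (format is an assumption)."""
    lines = ["%s\t%s" % (p, md5_hex(p)) for p in file_paths]
    out = open(manifest_path, 'w')
    try:
        for line in lines:
            out.write(line + "\n")
    finally:
        out.close()
    return lines
```

That is roughly twenty lines instead of four, which narrows the gap with the C# version considerably.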
Further ideas
I haven't actually run any related tests but I don't see any reason why multiple Task Factories could not be accessed within the same MSBuild file. No doubt there
are many examples in which certain steps, as part of a larger goal, could be done most efficiently (or perhaps be done period) in C# while others would
be better written in IronPython.
The PythonClassFactory currently uses a call to CreateScriptSourceFromString on the IronPython ScriptEngine but could
presumably be modified to have the option to call CreateScriptSourceFromFile instead. Then to follow the structure implemented
by Microsoft's original CodeTaskFactory, a Code element would store the location of the .py file in a Source attribute. The presence of that attribute
would indicate CreateScriptSourceFromFile should be used within the Task Factory instead of the default CreateScriptSourceFromString.
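The branching described here can be modeled in a few lines. The sketch below is plain Python rather than the C# the factory itself would be written in, and it only models the decision, not the IronPython hosting calls. The function name resolve_script_source and the exact shape of the Code element are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def resolve_script_source(code_element_xml):
    """Decide where the script comes from.

    Returns ('file', path) when the Code element carries a Source
    attribute, otherwise ('string', inline_script_text).
    """
    code = ET.fromstring(code_element_xml)
    source = code.get('Source')
    if source:
        # The factory would hand this path to CreateScriptSourceFromFile.
        return ('file', source)
    # Default: inline script text, as with CreateScriptSourceFromString.
    return ('string', (code.text or '').strip())
```

With this shape, existing inline tasks keep working unchanged, and only tasks that opt in with a Source attribute take the file path.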
See also: Custom task for reporting on recent check-ins to TFS
THE END.