Converting a VHDX to VHD with PowerShell

I was recently asked about converting Hyper-V VHDX volumes to VHD. Here’s a quick and dirty post about the conversion process.

Start an Administrator PowerShell session and do the following:

Install Hyper-V and the Hyper-V Management Tools

Note: This will require a system reboot.

Windows 10

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All

Use Convert-VHD

Installing the Hyper-V Management tools adds the Convert-VHD cmdlet to the system. On current Windows releases the cmdlet infers the output format from the destination file’s extension (.vhd or .vhdx), so no separate format parameter is needed. Below are examples of how to use Convert-VHD for volume conversion.

Example 1: Convert example.vhdx to example.vhd

Convert-VHD -Path example.vhdx -DestinationPath C:\mpwd\example.vhd

Example 2: Convert example.vhd to example.vhdx

Convert-VHD -Path example.vhd -DestinationPath C:\mpwd\example.vhdx
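If you have a pile of disks to convert, generating the command lines can save some typing. Here is a minimal Python sketch, assuming Convert-VHD picks the output format from the destination extension. The helper name convert_vhd_command is made up for illustration, and it only builds the command text; it does not run anything.

```python
from pathlib import PureWindowsPath

# Hypothetical helper (not part of any Microsoft tooling): builds the
# Convert-VHD command line for one disk, flipping .vhdx <-> .vhd.
SWAPPED_EXTENSION = {".vhdx": ".vhd", ".vhd": ".vhdx"}


def convert_vhd_command(source: str, dest_dir: str) -> str:
    src = PureWindowsPath(source)
    ext = src.suffix.lower()
    if ext not in SWAPPED_EXTENSION:
        raise ValueError(f"not a VHD/VHDX file: {source}")
    # Same file name, opposite extension, placed in the destination folder.
    dest = PureWindowsPath(dest_dir) / (src.stem + SWAPPED_EXTENSION[ext])
    return f'Convert-VHD -Path "{src}" -DestinationPath "{dest}"'


print(convert_vhd_command("example.vhdx", r"C:\mpwd"))
print(convert_vhd_command("example.vhd", r"C:\mpwd"))
```

Paste the printed lines into the Administrator PowerShell session to run the conversions.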

Seeing the Light

There is light at the end of the tunnel.

So much has happened over the past few years that it’s odd to think about closure paired with continued success. With any luck I’ll be wrapping up my master’s program at the end of the year and entering my second year of employment at Kroll. I’ve also been remodeling my home. This isn’t a humblebrag post. If you enjoy doing something it doesn’t automatically become easy. This year has been difficult and not uniformly successful. Learning experiences? Yes. Fuck-ups? Also yes. Running water? Occasionally. At least no one died as a result.*

I’m excited to get back to punk-rock computing: using my free time to research and test what I want, how I want, and blog about it. That’s why this post exists. I’m kicking the rust off the ol’ WordPress install to make sure it works (I’ve been paying someone to maintain it… did they?) and that I know how to hit the “Publish” button with just the right amount of intensity. (About 4.63 intensities or more.)

I’ve picked up many new tools since I last posted that I’d love to write about: Python 3 + pandas, KAPE, DeepBlueCLI, and others. I look forward to posting how I use them, what I use them for, and what I think the future of DFIR could look like. I’ll also post about my life, what’s on my mind, and a few links to weird websites that remind me of how the Internet was in 1996. Webrings, anyone? More posts soon!

* That I am aware of.

 


Disruptive Technology Theory

Disruptive Technology Theory has come up frequently in my coursework. It is largely misunderstood, or mistakenly attributed to any firm or idea that happens to be successful. There’s a collection of articles I will post soon (I’m finishing up a summer class at the University of Arizona this week!), but I thought this podcast episode warranted a share on its own. Here is a link and description:

The Disruptive Voice, episode 15: Is Uber Disruptive?

Is Uber disruptive? We asked five experts on the theory of disruptive innovation this question and received varying responses, yet their prescriptions for what lies ahead for Uber and the incumbent taxi companies vary less than you might think. In this episode, we revisit Professor Clay Christensen’s December 2015 article in Harvard Business Review, “What Is Disruptive Innovation?”, with co-author Rory McDonald, Innosight Managing Partner Scott D. Anthony, Christensen Institute co-founder Michael B. Horn, and Forum Senior Researchers Tom Bartman and Efosa Ojomo. Also discussed: the platform business model through the lens of disruptive innovation and what’s next for Uber.


Using ScanSnap Manager to OCR non-ScanSnap PDFs

I had some PDFs that I wanted to run optical character recognition (OCR) on. I have a Fujitsu ScanSnap and wanted to use the ScanSnap Manager software to do this. The software checks the PDFs you give it and will only process those that originated from ScanSnap hardware. I wanted to circumvent this and it ended up being easy.

PDFs created with a ScanSnap carry a “Creator” metadata tag whose value includes the model string. You can use ExifTool by Phil Harvey to print and modify this metadata. For example:

$ exiftool -creator ~/example.pdf
Creator                         : ScanSnap Manager #iX500

The file example.pdf has the correct tag/value pair and will be processed. The next file, covfefe.pdf, does not. You can add or modify the tag on a PDF that did not originate from a ScanSnap.

$ exiftool -creator="ScanSnap Manager #iX500" ~/covfefe.pdf 
    1 image files updated
$ exiftool -creator ~/covfefe.pdf
Creator                         : ScanSnap Manager #iX500

Voila! The ScanSnap Manager software will now process the PDF. You can certainly use free OCR software, but I didn’t find any of it to be quite as slick. Plus this was more fun. 🙂
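If you have a whole folder of PDFs to tag, the same exiftool call is easy to wrap. This is a minimal Python sketch, assuming exiftool is on your PATH. The function names exiftool_set_creator_cmd and tag_pdfs are made up for illustration. Passing the arguments as a list sidesteps shell quoting of the “#” in the model string.

```python
import subprocess
from pathlib import Path

# Creator value shown in the exiftool output above.
SCANSNAP_CREATOR = "ScanSnap Manager #iX500"


def exiftool_set_creator_cmd(pdf, creator=SCANSNAP_CREATOR):
    """Build the argument list for: exiftool -creator="<creator>" <pdf>."""
    return ["exiftool", f"-creator={creator}", str(pdf)]


def tag_pdfs(folder):
    """Run exiftool on every .pdf in folder (hypothetical batch helper)."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(exiftool_set_creator_cmd(pdf), check=True)


print(exiftool_set_creator_cmd("covfefe.pdf"))
```

By default ExifTool keeps a copy of each file it changes with an “_original” suffix, so the untouched PDFs stay next to the tagged ones.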


Welcome to the World of Tomorrow! (Again)

I wanted to follow up on my previous post about scraping Futurama episode ratings from IMDb. I used tools I was familiar with to get the job done, but someone told me I really should check out BeautifulSoup and do it all in Python. It ended up working great and I’ll continue to use BeautifulSoup for web scraping in the future. This is what I did in the IPython interpreter:

import re, requests
import numpy as np
import pandas as pd
import scipy.stats as stats
from bs4 import BeautifulSoup

# create soup object

r = requests.get("http://www.imdb.com/title/tt0149460/eprate?ref_=ttep_sa_2")
# name the parser explicitly so results don't depend on which parsers are installed
soup = BeautifulSoup(r.content, "html.parser")

# scrape scores

scores = []
for score in soup.find_all("td", {"align": "right", "bgcolor": "#eeeeee"}):
    scores.append(float(score.get_text().strip()))

# scrape episodes

titles = []
for title in soup.find_all("a", {"href": re.compile(r"/title/tt")}):
    if len(title["href"]) == 17:
        titles.append(title.get_text().strip())

cols = ["IMDb Rating"]

# build dataframe

frame = pd.DataFrame(scores, titles, cols)

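# maths with numpy (np.std defaults to the population standard deviation, ddof=0)

np.std(scores)
np.mean(scores)

# maths with pandas (Series.std defaults to the sample standard deviation, ddof=1)

s = pd.Series(scores)
s.std()
s.mean()
s.describe()

frame.describe()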

# test for normal distribution

stats.normaltest(scores)
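A note on the len(title["href"]) == 17 check in the episode loop: “/title/tt” plus a seven-digit ID plus a trailing slash comes to exactly 17 characters, which is presumably why the length test picks out bare episode links. A slightly more explicit alternative is to match the whole href against a pattern. This is a sketch using only the standard library; the hrefs below are made up for illustration and the seven-digit ID format is an assumption.

```python
import re

# Assumption: an episode link is exactly "/title/tt" + seven digits + "/".
EPISODE_HREF = re.compile(r"/title/tt\d{7}/")


def is_episode_href(href: str) -> bool:
    """True only when the entire href matches the episode-link pattern."""
    return EPISODE_HREF.fullmatch(href) is not None


# Made-up hrefs for illustration; not scraped from IMDb.
hrefs = [
    "/title/tt0000001/",
    "/title/tt0000001/?ref_=example",
    "/title/tt0000001/eprate",
    "/name/nm0000001/",
]
episode_hrefs = [h for h in hrefs if is_episode_href(h)]
print(episode_hrefs)
```

In the loop above, the length comparison could be swapped for is_episode_href(title["href"]).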

The ISTA 350: Programming for Informatics Applications course at the University of Arizona helped me a lot after my initial post. Additionally, the book Web Scraping with Python by Ryan Mitchell is one I’d recommend keeping handy.


Scraping IMDb Futurama Episode User Ratings

Good news, everyone!

This entry is effectively a two-fer. It will show how I used some basic tools and a pinch of Python with numpy to get some of the data I needed for a class project. I took a look at my favorite television show, Futurama. I used the average Internet Movie Database (IMDb) user rating for each episode to see how many standard deviations away from the mean the top four episodes are. The ultimate goal of the project was different but this was a good way to use data to support facts.

Quick. Dirty. Scraping.

IMDb has a page with every Futurama episode and its average user rating. The URL is http://www.imdb.com/title/tt0149460/eprate?ref_=ttep_sa_2. Note that the direct-to-video movies are excluded (rightfully) from this list.

Let’s scrape that data!

$ curl -s 'http://www.imdb.com/title/tt0149460/eprate?ref_=ttep_sa_2' | grep -i 'users rated this' | cut -d' ' -f5 | cut -d'/' -f1 > /tmp/scores.txt

Oddly enough the OSCP labs had me scrape this way frequently. I didn’t have the time to push the labs hard or take the practical but some information stuck. I’ll hopefully get back to that OSCP soon. 🙂
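For anyone who doesn’t read cut fluently, here is roughly what the filtering half of that curl pipeline does, sketched in Python. The sample line is made up to show the field positions and is not copied from IMDb’s markup. The function name extract_score is hypothetical.

```python
def extract_score(line: str):
    """Mirror: grep -i 'users rated this' | cut -d' ' -f5 | cut -d'/' -f1."""
    line = line.rstrip("\n")
    # grep -i: keep only lines containing the phrase, ignoring case.
    if "users rated this" not in line.lower():
        return None
    # cut -d' ' -f5: split on every single space and take the fifth field.
    fields = line.split(" ")
    if len(fields) < 5:
        return None
    # cut -d'/' -f1: keep what comes before the first slash, e.g. "8.1" from "8.1/10".
    return fields[4].split("/")[0]


# Made-up sample line for illustration.
print(extract_score("1,234 users rated this 8.1/10"))
print(extract_score("<td>nothing to see here</td>"))
```

Like cut, split(" ") treats every single space as a delimiter, so leading spaces on a line shift the field numbers.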

The curl pipeline gives us a file (/tmp/scores.txt) with each Futurama episode’s user score on its own line. All I really want is the mean and standard deviation anyway, which is easy to get with the Python interpreter.

>>> import numpy as np
>>> scores = []
>>> for line in open('/tmp/scores.txt', 'r'):
...   scores.append(float(line.strip()))
... 
>>> scores = np.array(scores)
>>> np.mean(scores)
7.8798387096774185
>>> np.std(scores)
0.58612723471593964

The mean is ~7.88 and the standard deviation is ~0.59. I used this information to compare the top four episodes. (The highest-rated episode is 2.745 standard deviations away from the mean!)
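The “standard deviations away from the mean” figure is just a z-score: (value - mean) / standard deviation. Here is a small sketch of that arithmetic using only Python’s standard library and a made-up list of scores, not the real IMDb data. One thing to keep in mind is that np.std defaults to the population standard deviation, which statistics.pstdev matches, while statistics.stdev is the sample version that pandas uses by default.

```python
import statistics

# Made-up scores for illustration only; not the real IMDb ratings.
scores = [7.2, 7.5, 7.8, 8.0, 8.1, 9.5]

mean = statistics.mean(scores)
population_sd = statistics.pstdev(scores)  # same definition as np.std(scores)
sample_sd = statistics.stdev(scores)       # same definition as pd.Series(scores).std()


def z_score(value, mean, sd):
    """Number of standard deviations that value sits from the mean."""
    return (value - mean) / sd


top = max(scores)
print(f"mean={mean:.2f} population_sd={population_sd:.2f} sample_sd={sample_sd:.2f}")
print(f"top score {top} is {z_score(top, mean, population_sd):.3f} SDs above the mean")
```

The sample standard deviation is always a little larger than the population one, so the z-score depends slightly on which you pick.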

For those interested, here’s a screenshot of the curl and Python session in action:

[Screenshot: Screen Shot 2017-03-28 at 7.30.45 PM]
