April 18, 2013 / jphoward

HBR Visualization Webinar data

I’ll be presenting an HBR Visualization Webinar tomorrow. For those interested in following along, here are the data files I’ll be using:

And here is some R code that we’ll use for the bulldozers data set:

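library(plyr)          # provides colwise()
library(randomForest)  # provides randomForest() and partialPlot()

# R >= 4.0 no longer converts strings to factors by default; the code below
# relies on factor columns, so request them explicitly.
df = read.csv("bulldozers.csv", stringsAsFactors=TRUE)
# Keep a random 10% sample of the rows so the model fits quickly
samp = df[sample(1:nrow(df), nrow(df)/10, replace=FALSE),]
write.csv(samp, "bulldozers_samp.csv", row.names=FALSE)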


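# Add a logical "<column>_NA" indicator for every column that contains
# missing values, then replace the missing values themselves with -1000.
appendNAs <- function(dataset) {
  append_these = data.frame(is.na(dataset[, names(dataset)]))
  names(append_these) = paste(names(append_these), "NA", sep = "_")
  # Keep only the indicator columns that are TRUE at least once
  append_these = colwise(identity, function(x) any(x))(append_these)
  dataset = cbind(dataset, append_these)
  dataset[is.na(dataset)] = -1000
  return(dataset)
}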

samp2 = appendNAs(samp)
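# randomForest restricts how many levels a categorical predictor may have,
# so factors with more than 32 levels are converted to their integer codes.
f0 = function(x) {
  if (nlevels(x)>32) {
    return (unclass(x))
  } else {
    return (x)
  }
}
samp2 = colwise(f0)(samp2)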

m = randomForest(SalePrice~., data=samp2, ntree=15, sampsize=5000, nodesize=25, do.trace=T)
partialPlot(m, samp2, Enclosure)
partialPlot(m, samp2, ProductSize)

  1. Punit (@puneethmishra) / Apr 29 2013 4:33 pm

    Hi Jeremy,

    Does HBR have plans to make the recording of the webinar available?
    They mentioned it would be made available within a week, but that still has not happened.

    Could you please make the webinar available, so that we can revisit some of the important concepts you touched upon?


