Weathering Thru Tech Days: AM: BMR Book Errata

Friday, March 11, 2016

AM: BMR Book Errata

3/9/2016

Section 4.11.1

The `rowSums` formula should read
\[
\mathbf{A1}=\sum_{i}^{n}\mathbf{A}_{*i},
\]
and the `colSums` formula should read
\[
\mathbf{A}^{\top}\mathbf{1}=\sum_{i}^{m}\mathbf{A}_{i*}
\]
Note: it is a little bit confusing. Following R semantics, the rowSums() method means "row-wise sum" which is the same as "sum of columns"; and vice versa, the colSums() method means "column-wise sum", which can be computed as the sum of matrix rows.

Section 6.1

The dimensions of matrix $$\mathbf{V}$$ are $$\mathbf{V}\in\mathbb{R}^{n\times k}$$.
Formula (6.1) should read
\[
\begin{equation}
\boldsymbol{a}_{pca}=\mathbf{V}^{\top}\left(\boldsymbol{a}-\boldsymbol{\mu}\right).\label{eq:to-pca}
\end{equation}
\]
Formula (6.2) - (6.3) should read
\[
\begin{eqnarray}
\boldsymbol{a} & = & \left(\mathbf{V}^{\top}\right)^{\text{-}1}\boldsymbol{a}_{pca}+\boldsymbol{\mu} \\
& = & \mathbf{V}\boldsymbol{a}_{pca}+\boldsymbol{\mu}.\label{eq:from-pca}
\end{eqnarray}
\]

3/12/2016

Section 8.3

In Step (1): Setup working directories and acquire data, the URL of the Wikipedia XML dump has since changed. The third command of Step (1) should read:

curl http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles10.xml-p002336425p003046511.bz2 -o $WORK_DIR/wikixml/enwiki-latest-pages-articles.xml.bz2

We keep the Kindle edition updated with the Errata. If you have ever bought the print version from Amazon, the Kindle version is free via Amazon MatchBook. If you already have a Kindle version, you should be able just to reload it with the updated one.

8 comments:

UnknownMarch 12, 2016 at 4:41 AM
Hi,
Firstly I must say that your book is excellent. Thank you very much for writing the book that leads us to develop many distributed mathematical algorithms. I have a problem for the first example of the book.
Page 19., Example 2.4 Simulating regression data with a small noise
When I declare mxData object the scala interpreter launches that error: error: recursive value mxData needs type
Can you explain me why?
ReplyDelete
Replies
UnknownMarch 13, 2016 at 1:24 AM
Thanks Dmitriy for your reply. I have created a replica from your code repository and it is working very well.

Kind regards.
ReplyDelete
Replies
UnknownNovember 14, 2016 at 9:45 PM
hello Sir,
Firstly thanks for your excellent book.I have a problem for the example of parallel matrix multiplication.My code is here :
package myMahoutApp.mthread

/**
* Created by chomon on 11/14/16.
*/
import org.apache.log4j.{BasicConfigurator, Level}
import org.scalatest.{FunSuite, Matchers}
import org.apache.mahout.math._
import scalabindings._
import RLikeOps._
import org.apache.mahout.logging._

import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

class MThreadSuite extends FunSuite with Matchers {
BasicConfigurator.configure()
private[mthread] final implicit val log = getLog(classOf[MThreadSuite])
setLogLevel(Level.DEBUG)

test("mthread-mmul") {

val m = 5000
val n = 300
val s = 350

val mxA = Matrices.symmetricUniformView(m, s, 1234).cloned
val mxB = Matrices.symmetricUniformView(s, n, 1323).cloned

// Just to warm up
mxA %*% mxB
MMul.mmulParA(mxA, mxB)

val ntimes = 30

val controlMsStart = System.currentTimeMillis()
val mxControlC = mxA %*% mxB
for (i ← 1 until ntimes) mxA %*% mxB
val controlMs = System.currentTimeMillis() - controlMsStart

val cMsStart = System.currentTimeMillis()
val mxC = MMul.mmulParA(mxA, mxB)
for (i ← 1 until ntimes) MMul.mmulParA(mxA, mxB)
val cMs = System.currentTimeMillis() - cMsStart

debug(f"control: ${controlMs / ntimes.toDouble}%.2f ms.")
debug(f"mthread: ${cMs / ntimes.toDouble}%.2f ms.")

trace(s"mxControlC:$mxControlC")
trace(s"mxC:$mxC")

(mxControlC - mxC).norm should be < 1e-5

def mmulParA(mxA: Matrix, mxB: Matrix): Matrix = {
val result = if (mxA.getFlavor.isDense)
mxA.like(mxA.nrow, mxB.ncol)
else if (mxB.getFlavor.isDense)
mxB.like(mxA.nrow, mxB.ncol)
else mxA.like(mxA.nrow, mxB.ncol)

val nsplits = Runtime.getRuntime.availableProcessors() min mxA.nrow
val ranges = createSplits(mxA.nrow, nsplits)
val blocks = ranges.map { r ⇒
Future {
r → (mxA(r, ::) %*% mxB)
}
}

Await.result(Future.fold(blocks)(result) {
case (result, (r, block)) ⇒
result(r, ::) := block
result
}, Duration.Inf)
}
}

def createSplits(nrow: Int, nsplits: Int):
TraversableOnce[Range] = {
val step = nrow / nsplits
val slack = nrow % nsplits
((0 until slack * (step + 1) by (step + 1)) ++

(slack * (step + 1) to nrow by step))
.sliding(2).map(s => s(0) until s(1))

}
}

Although I gave mmulParA method,it appears "cannot resolve symbol mmulParA".
How can I solve?
ReplyDelete
Replies

Add comment