So, uh, I play roller derby. Recently, I wrote an essay outlining
some of the things that I’ve learned by/about playing roller derby over
the past couple of years. I originally posted this essay on my league’s
forum, but the more I think about it, the more I believe that many of
these points apply (albeit in some modified form) to academia. So I’m
posting a link to it here too. Enjoy!
Miscellaneous | No Comments »
October 24th, 2012
Barack Obama on reasoning under uncertainty:
“Nothing comes to my desk that is perfectly solvable,”
Obama said at one point. “Otherwise, someone else would have solved it.
So you wind up dealing with probabilities. Any given decision you make
you’ll wind up with a 30 to 40 percent chance that it isn’t going to
work. You have to own that and feel comfortable with the way you made
the decision. You can’t be paralyzed by the fact that it might not work
out.” On top of all of this, after you have made your decision, you need
to feign total certainty about it. People being led do not want to
think probabilistically.
(From this article.)
Miscellaneous | 1 Comment »
September 17th, 2012
Ten years ago, I was an M.Sc. student at the University of Edinburgh,
about to start my thesis research on “FSA Induction for Real World
Datasets.” A little over ten years ago to the day, Miles Osborne,
my thesis advisor, emailed me to say that there was a talk taking place
the next day that I absolutely had to attend: “From Grep to Graphical
Models” by Fernando Pereira
from the University of Pennsylvania. As someone whose academic
interests included UNIX tools, regular expressions, finite state
automata, HMMs, and graphical models, my mind was blown by the title
alone.
I attended the talk and I loved every second of it. Fernando introduced conditional random fields, a new probabilistic model for sequential data that he and two other researchers, Andrew McCallum (or “??? from WhizBang” according to my handwritten notes!) and John Lafferty,
had just invented. Sure, I didn’t understand everything he said, but to
me, CRFs sounded like the most interesting thing on earth: not only did
they draw on ideas from HMMs and multiclass logistic regression (which
I’d recently learned about in Miles’ class on “Data Intensive
Linguistics”), CRFs were undirected graphical models, unlike the
(directed) graphical models that I’d previously encountered. At the end
of Fernando’s talk, I told Miles that I wanted to do my thesis research
on CRFs. Miles laughed and said he’d been planning to convince me to
work on CRFs anyway, so this switch was perfect.
Miles arranged for me to meet with Fernando. I still have my
handwritten notes from that meeting. I learned about all kinds of things
that, at the time, I only half—no, maybe one quarter—understood. Each
one of them sounded unbelievably exciting to me. When I told Fernando
that I would be starting a Ph.D. at the University of Cambridge after my
M.Sc., he asked me why I hadn’t applied to Penn. My response (which I’m
still embarrassed by!) was that I’d never heard of Penn. He told me that
it wasn’t too late to change my mind, but I was firmly convinced that
Cambridge was the center of the universe and, besides, I’d never heard
of this no-name university, Penn.
I spent the rest of my M.Sc. working on CRFs. I spent months poring
over the original CRF paper. I rederived every single equation myself. I
tracked down books and papers from the ’70s and ’80s about Markov
random fields, the Hammersley-Clifford theorem, and semirings. I
downloaded statistics articles from JSTOR. I read machine learning
papers on everything from factorial HMMs to numerical optimization
methods. It was an awesome experience (and ultimately resulted in this).
At the end of my M.Sc., I moved to Cambridge and started my Ph.D. Three months later, in December, I attended my first NIPS
conference. I had one goal: to talk to all three authors of the CRF
paper and to convince one of them to let me work with them over the
following summer. I achieved my goal. I ran into Fernando on a staircase
in the Westin, I introduced myself to Andrew McCallum after a talk, and
I cornered John Lafferty (poor man!) in an elevator. All three said
yes.
On Christmas Day (or maybe Christmas Eve?) I received an email from
Fernando saying he’d just read my M.Sc. thesis and had some questions.
(To cut a long story short, due to my lack of a statistics/probability
background, I’d made a “thinko” regarding linearity of expectation.
Oops.) I was amazed and honored that he’d read it—and that he’d done so
over his Christmas break.
Over the next few months, I received emails from both Fernando and
Andrew; however, as a disorganized Ph.D. student, unsure of what I
wanted to do, I don’t think I replied. Eventually, in May, I received a
contract from Fernando indicating that I would be spending three months
at Penn, starting in June. (We may have exchanged emails prior to that,
but I’m not sure.) I signed it.
On June 19, I flew from London to Philadelphia. I took the SEPTA
train from the airport to the Penn campus, where I single-handedly
lugged my giant suitcase up a flight of stairs in the hottest
weather I’d ever experienced. I somehow managed to make my way to the CS
department (this was back in the days before GPS and smartphones etc.),
where I found Fernando in his office.
I’m not really sure what to say at this point because, really, that
was when my life as an adult began. Spending that summer at Penn was the
biggest adventure of my life. And I loved every second of it. Okay,
sure, I got mono and spent a month incredibly sick and barely able to
move, but apart from that, it was amazing. At the end of the summer, I
didn’t want to return to Cambridge.
To cut a long story short, Fernando offered to let me stay at Penn
for a year. Somehow, one year turned into four, and I ended up doing a
strange, bicontinental Ph.D. Fernando paid my salary and paid
for my trips back to Cambridge to satisfy my Ph.D. requirements. I’m
still not entirely sure what he got out of it—I never co-authored any
publications with him and his name didn’t end up on the cover of my
Ph.D. thesis. In fact, for most of the time I was at Penn, I wasn’t even
working on topics that aligned with his research agenda. And yet, he
was unfailingly supportive. He funded most of the first WiML workshop.
He met with me regularly and gave me advice on my research, as well as
academia in general. And, most importantly, he believed in me even when
my confidence was at an all-time low. I am indescribably grateful to him
for his generosity. I’m not sure I will ever be able to convey the
extent to which his generosity changed my life and made me who I am.
Ten years later, I’m an assistant professor. I’ve worked with two of
the three authors of the CRF paper. (When I left Penn, I did so to do a
postdoc with Andrew McCallum.) The WiML workshop is now in its seventh
year. I do research on topics that I love. But most importantly, I hope
that one day I’m able to be as good a mentor to, well, ANYONE as
Fernando was—and still is—to me.
TL;DR: Thanks, Fernando. You rock.
Miscellaneous | 2 Comments »
May 5th, 2012
Tim sent me a link to this blog post, which consists of a quote from this blog post. I think the quote is well worth taking seriously, especially if one is a Ph.D. student, postdoc, or young faculty:
[A] young person told me that I could hold to my principles about the
importance of my family, honesty and equality—and any of a hundred other
things because I had “made it.” This troubled me. It troubles me when I
hear the same thing from new Ph.Ds who are trying to get tenure. I don’t see how you can pretend to be someone else for 5 or 10 years until you have “made it” and then be your true self.
Miscellaneous | No Comments »
May 5th, 2012
For the past two years, the UMass Amherst Computational Social Science Initiative has been running a weekly seminar series. We’ve had some amazing
speakers: thirty-one of them to be precise (though I may have
miscounted). But here’s the really exciting bit: we videoed twenty-three
of the talks and the videos are available online. As far as I’m concerned, this is an unbelievable
resource for anyone interested in computational social science and, as
someone who was present at almost every one of these talks, I can tell
you they are definitely worth watching.
Miscellaneous | No Comments »
May 4th, 2012
Mark Dredze and I have written a guide on “How to Be a Successful Ph.D. Student.”
It’s a work in progress, so we’re looking for feedback from everyone:
those who were/are Ph.D. students, and those who advise students. You
can either post feedback here or email me directly.
Articles | 4 Comments »
March 21st, 2012
Recently, I’ve been trying to follow two complementary pieces of
advice on paper-writing. The first, which I’ve known about for some time
now and discussed with my research group back in September, is Jason
Eisner’s “Write the Paper First.”
Jason advocates going against the trend of last-minute paper-writing
commonly found in computer science. He provides many well-argued reasons
for this viewpoint, but one of the most compelling (to me, at least) is
the following:
But you can’t write effectively [on little sleep]. Writing involves many
big and small decisions, which will seem insurmountable when you’re
exhausted and panicked.
This observation ties in especially well with recent research on decision fatigue.
The second piece of advice is George Whitesides’ “Writing a Paper.” Whitesides (who has the highest h-index of any living chemist) argues for the continual use of outlines when writing papers:
A paper is not just an archival device for storing a completed research
program; it is also a structure for planning your research in progress.
[…] A good outline for the paper is also a good plan for the research
program. You should write and rewrite these plans/outlines throughout
the course of the research. […] The continuous effort to understand,
analyze, summarize, and reformulate hypotheses on paper will be
immensely more efficient for you than a process in which you collect
data and only start to organize them when their collection is
“complete.”
Of particular relevance to advisors and students is the following paragraph:
An outline […] contains little text. If you and I can agree on the
details of the outline (that is, on the data and organization), the
supporting text can be assembled fairly easily. If we do not agree on
the outline, any text is useless. […] It can be relatively efficient in
time to go through several (even many) cycles of an outline before
beginning to write text; writing many versions of the full text of a
paper is slow.
Articles | 1 Comment »
March 18th, 2012
Posting in my capacity as the current chair of the WiML Executive Board:
We are seeking new members to join the Women in Machine Learning (WiML)
Executive Board. The goals of the Executive Board include ensuring the
continued success of the annual WiML Workshop, facilitating activities
with a lifespan longer than one year, securing funding, and handling
publicity and communications. In previous years, the Executive Board has
secured grants to fund the Workshop, analyzed impact statistics, and
established a mentoring program, as well as advising the Workshop
organizers and maintaining infrastructure.
Prospective Executive Board members must
- commit to a two-year position,
- have participated in at least one WiML Workshop,
- demonstrate active participation in the machine learning community,
- have experience/interest in broadening women’s participation in CS.
Senior Ph.D. students, postdoctoral researchers, faculty, and
research scientists (in both industry and academia) are encouraged to
apply.
Applicants must submit a short (500 words or fewer) statement of interest and suitability, to include the following information:
- affiliation,
- relevant previous experience,
- reasons for interest and suitability.
Statements should be sent to wimlworkshop@gmail.com by March 30, 2012.
Opportunities | No Comments »
March 17th, 2012