In the assignment, you will be given an “Overarching research question” that you are asked to investigate by answering 7 specific questions. Rather than prescribing specific tests, the assignment breaks the overarching question down into 7 tasks, each of which requires you to choose and execute a suitable test. Together, the specific questions contribute to a better understanding of the overarching question. Your task is then to:

Start with a short review of this “Overarching research question” (around 250 words; this word count is only indicative), summarising what is known about the topic of your assignment based on the academic literature.

For each of the 7 specific research questions, identify the statistical tests that are most appropriate to answer it. You are encouraged to run more than one test for each question, although this is not a requirement. For each test that you run:

Justify why you have chosen this test over another one;

Identify the null hypothesis and the alternative hypothesis;

Determine whether you reject or fail to reject the null hypothesis;

Identify the potential limitations (i.e., underlying assumptions) of the test.

Explain the implications of each result for practice and theory, with a view to answering the “Overarching research question”.

Wrap up with a short discussion and conclusions.

Put all tables at the end of the document (but before the bibliography), and do not include them in your word count.


The final mark will depend on two equally-weighted criteria:

Identification of the right tests for each question, and their correct execution (note that there may be more than one suitable test, so the argument for why you have chosen a particular test is very important for this assessment);

Correct interpretation of the results, with a view to informing policy and practice.


Overarching research question: What are the key determinants of farm income?

Dataset: Farm Business Survey.xls

This is real farm-level data from the Farm Business Survey, see

Variables in dataset:

FarmID: ID of the farm;  

Farm Size (Standard Labour Requirements): full-time equivalent of work on the farm, which DEFRA uses as an indicator of the economic size of the farm. The cut-off points are as indicated below.


Land Area Owned (hectares): land associated with the farm;

Total hours worked on farm (farmer+spouse+other unpaid labour): hours worked on the farm by the farm household;

Paid Farm Labour (hours): hours worked by the paid workforce;

Cost of Capital: the total value of non-land assets of the farm (e.g., buildings, machinery);

Paid Labour Expenses – Hired Regular Labour (£): amount paid to regular workforce;

Paid Labour Expenses – Casual Labour (£): amount paid to non-regular workforce;

Farm Business Income (£): income of the farm;

Rent Paid – Land & Buildings (£): value of rent paid;

Robust farm types: main activity on the farm.

List of Tasks:

Summarise all variables (the mean and standard deviation will suffice).

Produce a histogram of the farm business income variable. What do we learn from this?

Test whether the hours worked on the farm by the farm household (unpaid) are, on average, greater than the hours worked on the farm by hired labour (i.e., the variable Paid Farm Labour (hours)). What do we learn from this result?

Test whether the farm business income variable differs across farm types. In doing so, identify which farm types differ from each other, if any. What do we learn from this association (or lack of association)?

Explore whether there is a correlation between Cost of Capital, Farm Size, and Farm Business Income. What do we learn from this result?

Cluster farms by their characteristics (using only: Farm Size, Land Area Owned, Total hours worked on farm, Paid Farm Labour, Cost of Capital, Hired Regular Labour, Casual Labour, Farm Business Income, Rent Paid). How many clusters of farms do we find? What do we learn about these clusters?

Using a regression, estimate which expense items have the highest return, i.e., which expense items lead to the highest farm income. In doing so, control for farm size and area of the farm. Be selective in the use of variables, and explain why you are including the ones you include. You will not need all the variables in the regression, and overfitting will be penalised.
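
As a rough illustration of the kind of calculation behind some of the tasks above, the following sketch computes a mean and standard deviation, a paired t statistic, a one-way ANOVA F statistic, and a Pearson correlation using only the Python standard library. It is a sketch under stated assumptions, not a prescribed method: the function names and the two short lists of hours are hypothetical, the numbers are made up and are not taken from the Farm Business Survey, and whether a paired test, ANOVA, or Pearson correlation is appropriate for a given task remains for you to justify. Converting a statistic into a p-value needs the t or F distribution, which is left to your statistical software.

```python
import math
import statistics

# Hypothetical toy values for illustration only; NOT Farm Business Survey data.
unpaid_hours = [2400.0, 3100.0, 2800.0, 3500.0, 2600.0]
paid_hours = [1200.0, 2900.0, 1500.0, 4000.0, 800.0]


def mean_sd(values):
    """Return the mean and the sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t(x, y):
    """Return the paired t statistic for mean(x - y) and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def anova_f(groups):
    """Return the one-way ANOVA F statistic with between and within degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    print("mean, sd of unpaid hours:", mean_sd(unpaid_hours))
    print("paired t, df:", paired_t(unpaid_hours, paid_hours))
    print("Pearson r:", pearson_r(unpaid_hours, paid_hours))
```

A positive paired t statistic here would point in the direction of household hours exceeding paid hours, but the conclusion depends on the one-sided p-value and on whether the test's assumptions (for example, approximately normal differences) are plausible for the real data.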


