How to conduct evaluations in fragile and conflict-affected states?

Fragile and conflict-affected states are challenging environments for international development programming, including evaluations. Sajid Chowdhury is joined by two special guests to talk about related challenges.

Monday 23 May 2016: Jon Bennett, Helen Stevenson, Sajid Chowdhury


Hello, and welcome to the IMC Worldwide podcast. My name is Sajid Chowdhury, and today, I am joined by two special guests to talk about evaluation of international development programmes, specifically in fragile and conflict-affected states.

My first guest is Jon Bennett, a specialist with over 30 years of experience in evaluation, post-war needs assessment, food security, internal displacement, rural development, relief, evaluation and NGO training.

He has worked in Africa and Asia in Field Director, Country Representative and Team Leader posts for the UN, World Food Programme, and Oxfam.

My second guest, Helen Stevenson, is a colleague here at IMC and project manager on several evaluations that IMC is managing for international development programme donors.

So, Jon, one of the things that we will be discussing today is evaluations in fragile and conflict-affected states. For anyone who might not be from an international development background, what is an evaluation?

Evaluation is the opportunity to look at accountability for programmes, projects, whatever you are dealing with, to enable them to have an independent view on the successes and failures of that programme.

If you go back to the 1980s in particular, we hardly did any evaluations in those days. People just went out and did projects and it was assumed that what people were doing was correct.

It was assumed that it was effective. It was assumed that everything went to the right people. And then what happened, particularly in the 1990s onwards, is that people started to question that.

They said ‘Was the aid delivered in the way that it was supposed to be? Did it go to the right people? Was it effective? What was the outcome of it? And what was the longer-term impact of it?’

And so the evaluation business came into its own. It started to develop skills in order to actually answer those questions. So we are now at a stage where evaluation is actually quite a business in itself.

There’s been an accumulation of experience, and techniques have been built up over the years, and now we are at a point where there are organisations that almost exclusively do evaluations.

I think that here at IMC, even a few years ago, we were not experts in monitoring and evaluation at all. And then we realised that every programme that we were working on was being asked these questions.

So we built up our expertise over time and hired new staff, and now I think it’s fair to say that every programme we do has an M&E component. We also carry out standalone evaluations of other programmes, and we have become experts in this field, bit by bit, responding to the demands of the donors and the international community.

I think going back to the other issue about accountability, however: the public, the British public if you like, have probably become a bit more demanding themselves because they know more about aid than they used to.

So because they know more, they expect, as taxpayers, that if significant amounts of money are spent on development and relief, that money is spent correctly.

And they need to have that independent view outside of people who actually implement the projects. They want accountability for the way in which their money has been spent.

So how does an evaluation team fit with a programme?

We often use the term ‘critical friend’. We are not there to criticise the programme but to look at how they are performing, see how they could maybe improve, and help them develop a system to monitor progress.

There are obviously lots of different kinds of evaluation. We talk about monitoring and evaluation, but they are not the same thing. There is ongoing monitoring of projects, and there are independent evaluations, which is what we have been talking about.

Maybe Jon you can talk about some of these different types of evaluations?

I mean to some extent, we have had to carve out our own space in this. Quite understandably, people who have been doing programmes in countries for example for many years would say ‘Who are you to come in here for two or three weeks and tell us something about our programme that we don’t already know?’

So we have to carve out a space for ourselves, and that space would really be in creating a different way of looking at it. Perhaps a slightly higher-level way of looking at it — ‘You know your programme and how it operates, but perhaps you haven’t looked at the overall impact of it or what consequences your programme has on other organisations in the same area.’

Those sort of questions. We have tried over many years to carve out that niche, which allows us to have some value to the people that we are working for.

The other thing is that there has to be some kind of evaluation culture in organisations. They have to accept the notion that they are going to be critically appraised and to look at what they do from a critical perspective and absorb the lessons that come out of that.

Some organisations simply don’t have that evaluation culture. You almost have to hammer them over the head with it. Others are very responsive to it.

At an even higher level, I think that evaluating a programme allows us to think about whether the model could be replicated elsewhere. Certain programmes are designed for a certain country or certain context, but might be very useful somewhere else.

Just talking about the different types of evaluation: it’s all about accountability, looking at effectiveness and efficiency and these words that we use.

But there are certainly different types of evaluations. For example, you could go into an emergency situation, and you could evaluate that situation right from the outset of the emergency all the way through to the recovery phase. You could go in, you could look at exactly what you are doing on day one.

This would be called a real-time evaluation. The other kind of evaluation, which is much more common, is that you go at the end of a programme, and you say ‘Now that you have done it, and you are about to close, let’s see what the impact is over, say, a three-year period or something like that.’

That is a much more common approach. There is the other form of evaluation, which is a sort of halfway house. You go halfway through a programme, and the reason you do it at that stage is because if there are any particular changes that need to take place, then you suggest those changes at that stage, and the programme can change course accordingly if they accept your recommendations.

So it depends on when you go into something, really, as to how valuable your evaluation is.

One of the biggest problems we have got, and I must say that this is across the board for almost every development organisation in the world, is that we talk about impact, but we don’t put resources into measuring impact.

Because the impact of any programme probably wouldn’t be detectable until at least three or four years down the line; in other words, after the programme has finished.

And very few organisations would ever give resources to looking at long-term impact of what they do.

I would say the type of evaluation also depends on the budget. We typically see something along the lines of one percent of the overall programme being allocated to either M&E systems or an evaluation.

You can design an expensive quantitative evaluation if you have enough funds. You can do randomised control trials, almost like what we think of as medical research.

But if you have a very limited budget, you might only be able to do some qualitative research, so focus group discussions, getting a group of people together to ask them about a programme, or perhaps even just a few interviews, which certainly limits your scope.

It can also work the other way around. I did an evaluation of UNICEF’s programme on the Asian tsunami, and they had no idea that they were going to raise so much money for the tsunami. They put that one percent down, and the evaluation budget was enormous.

So they ended up having to use it for different purposes, publications and learning documents or whatever.

So it can work both ways. It can be either too little or too much.

And you are hoping that the one percent (which, by the way, is not applied across the board) is about right. But one thing that has happened over recent years is that every programme over a certain amount of money has something that looks like an evaluation budget.

Let’s pick a slightly arbitrary figure: say, 200,000 pounds. Anything over 200,000 pounds that you are spending must be evaluated. This never used to be the case, but it is now.

So just to be clear: that one percent means that if the programme is of a particular amount, then one percent of its budget should go toward evaluation?

Yes. The value of the overall programme… a percentage of it would be put toward evaluation. That would be a usual process.

But of course, as I say, if it goes into the millions, then the evaluation budget would probably exceed what you really needed. But at least a minimum percentage is established.

Just talking about different contexts and countries where evaluations might take place, Jon, you gave a talk recently for colleagues at IMC, and you talked about your evaluation experiences in fragile and conflict-affected states. Can you talk us through that?

There has been an increasing focus on, and concern over, fragile states. Incidentally, the percentage of money given by donors like the British government to fragile states has increased over many years.

This is the same with many northern European countries now. We have closed down a lot of programmes at the same time that we have opened up more in fragile states.

It partly goes back to the whole concern around terrorism. We have to accept that ever since 2001, there has been a concern that fragile states are a breeding ground for terrorists, who are not only going to have an impact on that country but are also perhaps going to have an international impact.

There is a second reason for that, and that is that many fragile states create instability regionally. So it is a stability issue across a regional group of countries, if you like.

So yes, there have been increasing resources put into fragile states. And what that means is that we are dealing with environments that are inherently unstable.

And we are also trying to evaluate something that is always changing. It is a very volatile context in which we are working, which creates its own challenges.

And we have to accept that you do an evaluation in country X today, and in six months’ time it will be invalid because that country has moved on.

I will give you a prime example of that (this one, by the way, is not a fragile state). In Nepal, I was leading an IMC team doing an evaluation of the British government’s earthquake resilience programme, and we did a usual evaluation of how well people had prepared for the eventuality of an earthquake in Nepal.

We finished the evaluation and we all left, and four days later there was a massive earthquake in Nepal. Now, we didn’t know that would happen, and neither did they. It was extremely timely in that sense. And yet, as soon as we came out with our report, it was already redundant, because the earthquake had happened and everybody’s attention was diverted to dealing with its consequences, rather than to a rather dry report about preparedness.

And yet, the findings we had in that report were very pertinent to the issue of what was actually happening during the earthquake. So it was a strange situation in which we had done an evaluation that became instantly redundant.

This has other consequences as well. We are dealing with a team who we never meet, but we are also dealing with a situation that we never see.

This is remote management where there is a sort of cascading of responsibility.

We start here, we get as far as the Syrian border, and we cannot go any further, so then we rely on an indigenous organisation, who then relies on their own people on the ground who are actually living in those towns to send information about those places.

And so you are going down three or four layers here, and that has its own in-built consequences. One is the question of how accurate the information you are getting is. You do not really know.

You are trying to train people remotely as well, by the way, you are not even talking to them face-to-face. You are training them in going out to get information for you, sending information, and you are hoping that the information you receive is accurate, and you cannot verify that easily.

The only thing you can do is to double-check with them later on by having some kind of debriefing session, and you hope that by doing that, plus making sure that the questionnaires and everything they are using are robust, you are getting all the information that you need.

But it is a fairly risky programme.

With this Syria evaluation, the way that we tried to quality-assure data was firstly by having high-level training. Then, at the end of the data collection, once we had done a first analysis of the transcripts coming in from the field, we also held Skype conversations with an interpreter.

So we were in Gaziantep in Turkey calling our researchers in Syria to ask them what their opinion of the programme was because they had obviously been carrying out interviews and discussions but also clarifying any issues that we had found in the transcript.

So things that might seem quite obvious to them did not always come through once they had been translated and reached us. So we were clarifying issues and asking them what the challenges were when they were collecting data.

That might not have come through in the questionnaires either.

One of the things that I have been thinking of in the context of this conversation: Helen, you were talking about Skype calls across the border. But if we go back to a time before Skype, before the Internet, what was international development work in a fragile and conflict-affected environment like then, Jon?

Well, we talked earlier about duty of care, and we are now very conscious about duty of care to aid workers.

There have been more people killed in aid in the last ten years than in the previous twenty because now aid workers are very much targets as well as being very subject to security issues in their day-to-day work.

But I have to say also that back in the 1980s in particular, the risks that we took were absolutely disproportionate to what you would take today.

When I worked for Oxfam, we thought nothing of going into war zones with a Land Rover, driving straight across front lines and into situations that were hazardous.

We did not have that sense of danger that is perhaps now very acute. And one of the reasons for that is that we were not targets. We knew then that, as aid organisations, it would be extremely unfortunate if you were caught in crossfire, but you would certainly not be the target.

So we took risks then that would not be acceptable today. And I think there has been an acute increase in concern over people’s safety. It is an insurance thing as well, to some extent.

But the downside is that people are far less able to actually go down to the ground to see stuff than they used to be. And it does mean that we have a situation where, if I take Afghanistan for example, people will say ‘I have been two years in Afghanistan.’

And they have not. They have been two years sat in an office behind a huge bunker in Kabul, and they never got out of it. They just flew into the city, stayed in the city, and flew out two years later.

They did not see anything of the country.

Something that you talked to us about earlier was this concept of bunkerisation. Can you talk us through that?

By bunkerisation, I was referring to quite literally bunkers in some cases, in many cases actually, where people are hidden behind this security veil which they have created themselves, where they are unable to move into the field.

Even if they go into the field, they have to be heavily securitised: a private security company working side-by-side with them, and access only to certain areas that have to be checked in advance. All that sort of thing.

There is a lot of cushioning, if you like, of international aid workers from the subjects of their assistance. The consequence of this is that, unfortunately, consensus of opinion comes to reside just within the aid world.

So we have a situation where the aid world itself talks to itself a lot. If I work for the UN, I am talking to the next UN agency, who is talking to the next NGO, who is talking to the next journalist, who is talking to the next UN agency.

And before you know where you are, you have a kind of circular system of opinion going from one mouth to the next, and everyone is sat in one room almost.

And they are not necessarily getting this information from the field, but from themselves. There is a danger, first of all, that everyone talks the same language and has the same opinion. The second is that they pass on received wisdom as their own, when it is not really; they are not deriving it from any primary data at all.

And the third danger, which I think is a realistic danger, is that this consensus kind of becomes the paradigm that everybody is working under, and it becomes the acceptable paradigm.

So we are writing our reports with the same phrases in it. And the upshot of that is that something could be happening that we are just simply not seeing out there.

And then when it does happen, we are reacting to it, rather than anticipating it. And the reason we cannot anticipate is that we do not have the information.

So it is a problem of almost Chinese whispers, with an ever-decreasing pool of information.

I think this issue of remote management or bunkerisation of aid is certainly true in Syria, where the donors, the implementing partners, and the evaluation team ourselves were all relying on Syrian nationals to actually carry out programme activities, while we were sitting in Turkey or London waiting to get information back.

One of the positive sides, though, is that this particular programme had strong local ownership, and a lot of the people we spoke to during our evaluation said that it was viewed as a Syrian programme.

I am sure there was some understanding that it was donor-funded, but they did not necessarily know who that donor might be. And it certainly was not one of the most important aspects of it.

Which brings me to an important point: we need to invest far more money into building the capacity of local organisations to do their own evaluations. We have not invested in that in developing countries. And yet most developing countries do have their own evaluation organisations; what they need is the transfer of skills, and of money, to enable them to do their jobs more effectively.

I think that is where we have failed, really. We have invested in ourselves to do evaluations. We are still the dominant evaluators, if you like, coming from outside, rather than establishing indigenous skill sets.

I think that we all work with local partners. We rely on them heavily in our evaluations. But we could do a lot more in terms of building their capacity.

For the Syria project, we worked with an incredible local firm, a research firm based in Gaziantep but with a wide network throughout Syria, and I think they are much more capable than we first realised.

When we were planning our evaluation, we thought that they would just collect data for us and that we would analyse it and draw conclusions from it, but in reality, they were more than capable of doing at least the first level of analysis of the information that was coming in from Syria.

We helped them refine the tools, the questions we were going to ask people in the field, but they were a very capable organisation.

They might not be able to win contracts on their own, they might have to go through international firms like ourselves, but I think we can do a lot more in helping them develop their skills and one day have firms like themselves carry out entire evaluations independently.

This is a question for both of you: if you had to write a quick ‘how to conduct evaluations in fragile and conflict-affected environments’, what would be in there?

First, explain to the client, the person who has asked for the evaluation, that there is a high degree of speculation involved.

The information that you are going to generate, and the judgements you draw from it, are not pure, in the sense that you are not using quantitative data.

You are using a lot of qualitative data, and therefore quite a lot of flexibility needs to be allowed in these types of situations.

Secondly, despite that, it is possible to express some degree of independent view, provided you ensure that the people you talk to are genuinely not connected with the programmes, or, if they are connected, that they are beneficiaries and that you are accessing the right people; certainly not the people who are perpetuating this ever-decreasing loop of information.

So you have to somehow get around that and try to bring some kind of unique perspective. But other than that, it is going to be difficult, and you have to accept the security implications of this kind of thing.

I was doing an evaluation in Yemen not very long ago for two organisations, firstly for the British government, which did not allow me to do anything except sit in an armour-plated car and go to an embassy.

And then six months later, I went for one of the Geneva-based organisations, an agriculture organisation there. They just said ‘You get on with it yourself.’

So I hired a local driver and I went all over the country. Now the contrast between those two… they are both evaluations, both in the same country, both in the same context. And yet, both organisations had a very different approach to personal security.

And you could say that I took far too many risks in the second instance going all over the country. Or you could say conversely that I learned far more by doing so, and it is true, I did.

So when you are working in these fragile situations, the level of risk that you take has to be partly a personal decision; it is partly the responsibility of the organisation to have its own edicts in that respect; and thirdly, you have to strike a compromise between personal security and the information that you are likely to get.

I think you also need to expect delays throughout the process. Everything that can go wrong, will at some point. And like we said before, you have to be flexible and change your approach to some extent.

In Syria, we have had quite a few delays in data collection due to the conflict.

We had to wait a few days if a city was being bombed; we did not want to put our researchers at any risk. The client understood this, but it is inevitable when you are working in these kinds of contexts.

So Jon and Helen, this is for both of you. We have talked a lot about evaluations in different countries. We have talked about some wonderful examples. How do you make sure that all the research that is done in order to arrive at a set of evaluation findings is actually put into practice, put into policy?

I think that more and more, we are seeing learning being integrated into monitoring and evaluation. That is something that here at IMC we have taken very seriously and tried to invest a lot of resources into.

I think that you have to design a research uptake element into your evaluations. You have to think about it right at the beginning before you start your work because you have to think about these things as you do your evaluation: how do we make these findings useful?

How do we ensure that the right audience is picking them up and using them to inform policy, project design, or re-design if it is an ongoing project or one going into a second phase?

So it is not something that you can think of at the end and just say ‘Oh, how are we going to use these findings now?’ You have to really design it in at the beginning and make sure that the learning is being used.

Some years ago, I did an evaluation with the Overseas Development Institute in London, and I was asked to look into ‘how do people learn things?’

They do a lot of publications, by the way, on guidelines and policy papers. Probably not surprisingly, having done this survey of organisations and managers, we found that very few of them actually sat down with these papers and read them.

So when we said ‘Where do you actually get your learning from?’ they said ‘Well, from experience, from our own hands-on experience’, again, not surprisingly.

There is now, however, a push toward using evaluation not just as a paper to sit on a shelf, but taking it one step further, which is to say ‘We have done the evaluation, here are the recommendations; now you as an organisation have to follow up on those recommendations, and we are going to monitor that.’

We are going to say ‘What did you do with those recommendations? We are going to ask you in six and twelve months’ time.’ And I think that is a good thing.

It means that evaluation has been mainstreamed in an organisation. It means that the organisation is cognisant of integrating evaluation into their day-to-day lives if you like. So it is a good move.

Unfortunately, there are still a very large number of organisations in which those reports do just sit on the shelf and are perhaps not thoroughly read by more than two or three individuals. But at least we are getting somewhere.

We are getting to the point where the evaluation culture of organisations is improving. And, as I said earlier, the evaluation industry has become larger, more complex, and more sophisticated as a result.

And here at IMC, where we both implement projects and evaluate them, we are trying to develop an internal system where we can learn from our evaluation findings, for example, to inform project design.

We have not completely finished this yet, but it is underway. And I think it is important to use these findings not just for donors or stakeholders but also for the firms who are carrying out these evaluations… they can use them as well.

One example on the Syria evaluation: the donor asked us to give a certain number of presentations on our findings.

So some of the presentations were at a fairly local level, looking at the communities that we evaluated.

So those findings were interesting mainly to other people who were working in Syria maybe in the same communities or on similar types of projects.

But we also presented higher-level findings to the British government and to a number of different departments for people who might be working in Syria and in other countries on other programmes.

And they were fairly informal… we just presented for an hour and tried to get some discussion going, and I think things that might not seem particularly useful can actually get people thinking about lessons that they might be able to learn.

They might not read a 50-page report, but if they sit in front of a PowerPoint for half an hour and have a chance to ask questions, that might even be more useful.

That’s a very important point, actually: evaluation serves another function. It’s not just about reporting and presenting recommendations.

It is also about creating the space for organisations to reflect on what they do. And it is surprising that in the day-to-day rush of what you are doing, you very rarely have that opportunity.

So when evaluators come in, they allow you to sit back and just reflect a little bit on the overall consequences of what you are doing. And many people appreciate that.

I remember not long ago, a senior person at DFID said that it does not matter as an evaluator if you are right, but that you raise the issue for us to discuss.

That was Jon Bennett and Helen Stevenson discussing evaluations in fragile and conflict-affected states. Thanks for joining us. If you’d like to listen to more IMC podcasts, check out our website or find us on SoundCloud. Thanks again.
