Friday 17 November 2017

Of tadpoles, newts, and forced prostitution

Human trafficking reporting in the Netherlands has been at a stalemate for a long time. Numbers of presumed victims of trafficking were collected by a National Rapporteur on Trafficking in Human Beings and Sexual Violence against Children, appointed by law to be the official source on human trafficking for state policy purposes. These numbers were always hobbled by the fact that they were uncertain, and there were always questions about the value of the estimates derived from them. Then at the end of September 2017, the National Rapporteur unveiled a bombshell report announcing that the numbers had finally been settled:
The ability to reliably estimate the actual numbers contributes greatly to the fight against human trafficking. This was also emphasized in the High-Level Meeting on human trafficking with UN-members on 27 September 2017. The Dutch National Rapporteur on Trafficking in Human Beings and Sexual Violence Against Children has been analyzing the numbers of registered victims for years. Thanks to the relatively efficient registration in the Netherlands, it has been possible to reliably estimate the actual number of victims. Dr. Maarten Cruyff and Prof. Dr. Peter van der Heijden, experts in producing estimates, have further developed an existing method of estimation. Specialists in human trafficking, Prof. Dr Jan van Dijk, the Dutch National Rapporteur and UNODC together applied this method for estimating trafficking in the Netherlands. As a result, the Netherlands gained insight into the actual extent of trafficking in human beings. The implications of these new numbers will be addressed in the Monitor Human Trafficking that will be published on 18 October 2017.(1)
[emphasis mine]
Too bad these numbers are garbage, as will become clear.

The report hailed as the "reliable estimate" of the extent of human trafficking in the Netherlands is named "An estimation of the numbers of presumed human trafficking victims in the Netherlands"(2), and was released jointly by UNODC and the Bureau of the Dutch National Rapporteur. Critical examination of this report shows very little ground for the boastful claims by the Rapporteur.

The method employed in this report to estimate the number of victims of trafficking is called "Multiple Systems Estimation" [MSE]. This method has been used to estimate numbers of people infected with HIV or hepatitis, or afflicted with neural tube defects, and to estimate the total number of deaths due to conflict in Guatemala, Colombia, Kosovo and Peru. In all these cases, the real number is unknown, but several surveys have been performed to try to chart the population - in the latter cases, the dead. Via MSE, the different lists of discovered individuals, and in particular their differences and similarities, can be used to make an estimate of the total population, including the part of the population that has never been surveyed.

The basis of MSE lies in a much older technique, employed in wildlife surveying, called Capture-Recapture. This method takes a closed habitat [a pond for newts, for instance, or an island for iguanas] and catches a number n of animals. These animals are then marked, and released back into the habitat. A second survey is then made, catching K animals. Of these, a certain number k will have been caught in the earlier survey, and carry a mark. If we can apply a set of assumptions implying that the relations between the seen and unseen groups will be in proportion in both surveys, so that we can claim similarity between the resulting survey lists, we can then reason that the proportion k/K will be close to the proportion n/N, with N the total population. An estimator for N would then be

N̂ = nK / k
[The 'hat' diacritical indicates something is an estimator. There are more complex, more accurate estimators, but for the purposes of this article the simplest approach suffices. Higher accuracy estimators obscure the mechanisms at work by added complexity, without changing the underlying method or the outcome by more than a small fraction.]
Basically we are mixing a known quantity of marked animals into the total population, and then measuring in our second survey what the concentration of marked animals is.
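To make this concrete, here is a minimal sketch of that simple estimator in Python; the function name and the pond numbers are invented for illustration and are not taken from the report.

```python
# Minimal sketch of the simple capture-recapture estimator described above.
# The numbers are made up for illustration.

def estimate_population(marked, second_catch, recaptured):
    """N-hat = n * K / k: marked on first pass, caught on second pass, recaptured."""
    return marked * second_catch / recaptured

# Mark 50 newts, catch 40 on the second pass, find 20 of them carrying a mark:
print(estimate_population(50, 40, 20))  # -> 100.0
```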

MSE is the same principle at work, with the same assumptions and logic behind it. What makes MSE different is that more than two lists of survey results are used. If there are multiple lists, there is more information to be used towards making a thorough estimate. More lists mean more datapoints, and less chance of a freak sampling accident offsetting the data and skewing the results of the estimate. Most importantly, the lists can be compared to each other to find the variability and homogeneity between them. The basic concept is the same, but the computation becomes much more complex. The approach taken in this case is to use a maximum likelihood estimation method: to find the number N for which the observed overlaps between the lists are least unlikely. This is only doable by computer. Unfortunately, that means that the actual computation happens in a proverbial black box, and we can't see the math at work.

The clearest perspective from which to understand the mechanics of the method is to take the total of all lists and view it as a collection of sets. The sets are defined by their inclusion or non-inclusion in all considered lists. With n lists, there are therefore 2^n sets to consider. The interrelation between these sets can be computed, because all sets are known, except for the set that is not included in any list: the 'missing' set, which we are trying to estimate. If the distribution in the lists is homogeneous and independent, we can expect the interrelations between categories of sets to also hold for the 'missing' set, the value of which can then be inferred.
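To make the 'collection of sets' picture concrete, here is a small sketch with made-up list names and individual IDs; it only performs the bookkeeping of the 2^n inclusion patterns, not the report's actual log-linear fitting.

```python
# Tabulating the 2^n inclusion patterns for a handful of invented lists and IDs.
from itertools import product

lists = {
    "police":     {"a", "b", "c", "d"},
    "shelter":    {"b", "c", "e"},
    "inspection": {"c", "f"},
}
names = list(lists)
observed = set().union(*lists.values())

# One cell per inclusion/non-inclusion pattern, 2^n in total.
cells = {pattern: 0 for pattern in product((0, 1), repeat=len(names))}
for person in observed:
    cells[tuple(int(person in lists[n]) for n in names)] += 1

for pattern, count in cells.items():
    label = ", ".join(n for n, bit in zip(names, pattern) if bit) or "no list at all"
    print(f"{label}: {count}")

# The 'no list at all' cell is necessarily counted as zero here: nobody can be
# observed on zero lists. MSE fits a model to the other 2^n - 1 cells and
# extrapolates it into that empty cell to estimate the unseen population.
```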

This is a very short overview of the background of the method, and for the layman who wishes to know more about this method I recommend Counting Civilian Casualties: An Introduction to Recording and Estimating Nonmilitary Deaths in Conflict by Taylor B. Seybolt, Jay D. Aronson, and Baruch Fischhoff.
Readers with a little mathematical background will find a more condensed treatment in papers like Multiple Sample Estimation of Population and Census Undercount in the Presence of Matching Errors by Ding and Fienberg [http://www.statcan.gc.ca/pub/12-001-x/1996001/article/14385-eng.pdf] or An exploration of multiple systems estimation for empirical research with conflict-related deaths by Kruger and Lum [http://visionsinmethodology.org/wp-content/uploads/2015/05/KO-MSE.pdf]

This method rests on several assumptions. If any one of these assumptions is violated, the method derived from them is invalid, and the results are meaningless.

The least problematic assumption is independence. This is the assumption that inclusion of an individual on one list does not influence their probability of being on another list. Inferring the value of the missing set requires that it follows a distribution similar to that of the visible sets. If there is no independence, and one list inherently influences another, the interrelation is no longer inherent to the data set, and the estimation is no longer based on measurement of properties of the population alone. You're no longer measuring the target quantity, you're measuring your measurement.

For example: imagine we have a pond containing 100 newts. We survey by trapping 50 newts in traps baited with a specific food item. This trapping is so unpleasant that those newts now avoid the bait. On our second survey, collected with the same method of capture, we therefore catch 50 newts of which only 1 has been captured before. The resultant estimate for the total population would then be 2500. On the other hand, if the traps were experienced as a good feed, the nice experience would make the newts more likely to repeat the behaviour that got them trapped in the first place, and the effect would run the other way. If all but one of the second catch turn out to be marked, the estimate for the total population would be 51 newts.
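Running both scenarios through the same simple estimator shows how far the result drifts; the true population in both cases is the 100 newts of the example.

```python
# Trap-shy and trap-happy newts, true population 100, with the simple estimator.
def estimate_population(marked, second_catch, recaptured):
    return marked * second_catch / recaptured

print(estimate_population(50, 50, 1))   # trap-shy:   2500.0, wildly too high
print(estimate_population(50, 50, 49))  # trap-happy: ~51.0, far too low
```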

The reason this is the least problematic assumption is that in MSE, especially MSE with many lists, violations of independence can be analyzed from the many overlapping interrelations in the lists and compensated reasonably well. Computer programs are available to infer relations between the sets and rectify the dependencies between the lists. These rectifications model real-life relations, however, which should be investigated to check whether they are accurate. The resulting models may be used to nudge the lists further into agreement. Of course, this treatment of the data must be done with the utmost care. When you mess with the input data, it is very easy to reach a desired conclusion by introducing error. Sceptical review of the modeling, and of what it means in the real world, is therefore of the essence.

This modeling is done in the report in different ways, with different mathematical degrees of freedom. These degrees of freedom are mostly set for computational convenience. For example:
the 3-factor interaction parameters for such pairs of registers and the variable Q with the five polynomials for modelling trends over time are numerically unstable in the sense that they lead to highly inflated population size estimates and confidence intervals. To protect against this type of inflation, the restriction that only 2-factor interactions are allowed to enter the model was imposed. This restriction implies that no corrections are made for possible higher order interactions between the lists.(2)
In the report, the modeling of dependencies is done by trial and error. No attempt is made to match the modeling to real-life behaviors or phenomena; only the complexity and numerical stability of the model are considered. The different models tried show radically different outcomes, some mutually exclusive, which demonstrates the huge sensitivity of MSE to inaccuracy in the assumption of independence:

Table 2: Results of log-linear modelling of six lists of human trafficking victims in the Netherlands, 2014

Model                  Estimate   Confidence Interval   Pearson      p
M1. P,R,K,O,I,Z          10,542      8,802 - 12,956        577    .007
M2. P,R,K,OZ,I           15,711     12,552 - 20,576        226    .017
M3. P,R,K,OZ,IZ          17,812     14,026 - 23,874         66    .130
M4. R,K,OP,OZ,IZ         22,270     16,871 - 32,275         49    .175
M5. K,PR,OP,OZ,IZ        32,646     22,299 - 56,048         46    .173


It isn't strange that the MSE estimate of Dutch trafficking figures is so perturbed by analysis of the interdependence of the lists. For instance, the National Police routinely turn presumed victims of human trafficking over to aid organizations, who then re-report the same person as a discovered presumed victim of human trafficking. This causes a large overlap between some lists.

The second assumption underlying MSE is homogeneity of the investigated population. Every individual in the investigated population must have the same probability of inclusion in a given list as every other individual. Assumptions of homogeneity can easily be invalidated by non-homogeneous or non-random sampling and re-sampling. Surveys set up in a limited domain can easily underestimate the total population, whereas spot checks on different domains can easily overestimate it.

We can illustrate this assumption with our newt example. Let's return to our 100-newt pond. We catch 15 newts, mark them, and release them back at the place we caught them. Imagine that the newts don't travel much in the pond, and stick to one place. Then, if our second capture expedition goes to the same spot in the pond, we will have a high number of marked newts on our second list. We might catch 10 newts, 9 of which turn out to be marked. That makes our estimate 17 newts. If our second trapping area is on the other side of the pond, hardly any newts on our second list will be marked. We might catch 10 newts, of which only 1 has been marked. The estimate would then be 150 newts. The population of the pond didn't change, but our estimate changes dramatically.
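The same simple estimator reproduces both numbers; again, the true population is the 100 newts of the example.

```python
# Heterogeneous capture probabilities skewing the estimate, true population 100.
print(15 * 10 / 9)  # re-trapping the same corner of the pond: ~16.7, rounds to 17
print(15 * 10 / 1)  # trapping the opposite corner:            150.0
```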

Once again, with multiple lists the MSE method can work around this limitation to some extent. As long as the lack of homogeneity can be deduced from either the lists or from external investigations, compensation can be performed. This requires that the heterogeneity is not so large that it dominates the outcome, and that it is well charted. These two requirements are very hard to meet in an environment like the hunt for human trafficking victims, and almost impossible to verify.

The report considers homogeneity only as a problem of equating the multitudinous forms of human trafficking in the Netherlands, and pretends to solve the problem by splitting out the lists by type of population. The assumption itself is therefore not actually addressed, and quietly presumed to have been met. Considering the complexity of this assumption, that doesn't suffice to make the report credible. However, it is impossible to say how much heterogeneity there is in this case.

The third assumption that must be met is that the population under survey is closed. Closure means that between the generation of the various lists, there is no 'escape' from the population. If the domain to be sampled is not well-defined, it isn't possible to properly sample a single population and correlate the lists. In effect, you end up sampling two different but hopefully overlapping populations.

Our example pond of 100 newts is so handy because a pond is a closed system. Newts rarely leave or enter, so the system is pretty much closed. Still, closure errors can be made. For instance, let's say we do our first survey and mark 50 newts. Then we come back 10 years later for the second survey, catching 50. Because most newts live for about 6 years, we find only 1 marked newt on our second survey. The resultant estimate is that 2500 newts live in the pond. We reach an estimate 25 times too high, purely because we ignored the closure requirement.
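The same estimator, run on this closure-violating scenario, confirms the factor of 25:

```python
# Closure violated: 50 marked, 50 caught ten years later, only 1 recapture.
estimate = 50 * 50 / 1
print(estimate)        # 2500.0
print(estimate / 100)  # 25 times the true population of 100
```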

In the report under discussion, closure is violated. The lists consist of disparate collections of observations by different agencies sampling disparate subsections of society. There is not one population that is carefully, randomly and completely sampled, but rather a few separate, not necessarily intersecting domains. People move in and out of those domains, or pass through only briefly. This also means people can be counted multiple times over time, or never enter the domain shared by all lists at all. The report ignores this issue.

The previous three assumptions undermine the credibility of the National Rapporteur/UNODC report, but if they were the only problems the report could still be salvaged, if pared down to a shadow of itself and reworked to be a lot more humble in its scope. The fourth assumption however completely invalidates the whole thing.

The fourth assumption is the assumption of an error-free dataset. MSE is basically an extrapolation from existing data, and translates the observed data into an estimate by taking differences in observations between lists to be indications of missing observations. Erroneous observations do not match across lists, and unmatched observations drive the estimate up. Most MSE applications have little trouble with this requirement. The rate of false positives in modern infectious disease diagnostics is not high enough to cause much trouble. Corpses of Peruvians and Guatemalans don't suddenly turn out to be alive.

The method is quite vulnerable to error. It contains no error detection, and wrong data doesn't jump out at you. Errors don't cause weird outliers or kinks in curves. An error isn't recognizable by its deviation from the modeling; it just skews the estimate. Especially if the true positives are scarce, even a modest false positive rate will quickly dominate. For instance, a test with 99% sensitivity and a 1% false positive rate, applied to a population with 1% true positives, will give roughly equal numbers of false and true positive results. In the case of the report under scrutiny, where nets are cast wide across all travelers, all sex workers, all immigrants and all youth to catch those few victims, this is a relevant consideration.
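A quick check of that claim, on an invented population of 10,000 screened people:

```python
# 1% prevalence, 99% sensitivity, 1% false positive rate, 10,000 people screened.
population = 10_000
true_cases = int(population * 0.01)
true_positives = true_cases * 0.99                  # 99
false_positives = (population - true_cases) * 0.01  # 99
print(true_positives, false_positives)              # equal numbers of each
```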

False positives act in two ways: they increase the number of "confirmed" individuals considered "found", and their absence from other lists suggests that a higher proportion of the population is unseen. As such, errors in the input show up at second order in the output: twice the error rate in the input quadruples the error in the output.

This sensitivity to list error cannot be computed away. As the adage goes: garbage in, garbage out. There is no compensation like there would be for a lack of independence or homogeneity. In fact, corrections for inhomogeneity or dependence are liable to throw the end result off even further if the lists are polluted with errors. If one list contains significantly more ghost individuals than the others, it will suggest a lack of independence among the less polluted lists, which will then be compensated mathematically, adapting the non-erroneous data to the error. This can either lower or raise the estimate, depending on the implementation of the dependence modeling. The same goes for inhomogeneity correction.

Especially if there is a significant proportion of false positives, even false negatives increase the estimate unduly. This seems counterintuitive, until you realize that false negatives, especially if they are counted on one list but not all others, decrease the overlap between the lists, thus inflating the estimate.

As an example, consider our newt pond again. As before, it contains 100 newts. It also happens to contain half a million tadpoles. Instead of trained biologists, the newts are now counted by enthusiastic but unskilled school kids, who can't accurately tell tadpoles from newts. They catch and mark 25 newts and 250 tadpoles. On the second survey, they catch another 250 tadpoles and 25 newts, of which 6 newts and 1 tadpole carry a mark. If the tadpoles had been filtered out in both surveys, the estimate would be 104, quite close to the real number. However, because of the false positive tadpoles, the estimate based on these lists is 10804. And if one of the marked newts from the first survey had been wrongly rejected in the second, thus introducing just one false negative, the estimate would become 12604.
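The three estimates above follow directly from the same simple estimator:

```python
# The tadpole scenario, checked with the simple estimator.
def estimate_population(marked, second_catch, recaptured):
    return marked * second_catch / recaptured

print(estimate_population(25, 25, 6))    # newts only:              ~104
print(estimate_population(275, 275, 7))  # tadpoles mixed in:       ~10804
print(estimate_population(275, 275, 6))  # plus one false negative: ~12604
```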

The newt pond example is easy and clear because it is so extreme. The estimate becomes hugely overblown because the false positives outnumber the real results by a factor of 10, so the resultant error in the estimate is around a factor of 100. Surely, surely that is a great exaggeration? Surely that is not even close to what is happening with the human trafficking numbers in the report under scrutiny?

It doesn't take much investigation to discover that the dataset used for this report is not free of errors. The lists in the analysis are supplied by the Dutch non-governmental organization "Comensha", which collects and tabulates any "signal of trafficking" sent to it. Signals are supplied through several channels, which are treated separately below.

The healthiest information is supplied by the ministerial inspectorate for social and labor affairs, the "Inspectie SZW". This inspectorate was set up to ensure proper treatment of workers, and to check adherence to labor laws. It is a benign organization with no agenda to inflate numbers. It is subject to the same guidelines the National Rapporteur imposes on all the supply channels, guidelines which compel reporting any individual as a presumed victim of trafficking at the slightest signal. However, since the Inspectie SZW does not deal with stigmatized groups, the list of signals they are supposed to report on is based in reality.

Not so with the police. The vice police especially are very eager to hunt for false positives. The culture within the police force views all sex work as demeaning and crime-related. Vice police raid and inspect sex workers aggressively, actively searching for signals of trafficking, raiding for that purpose alone without probable cause. The list of signals for the police is extensive and eagerly used. Signals of trafficking include being from eastern Europe, having lots of money, having no money, having new friends, having few friends, coming to the defense of one's employer, refusing to answer questions by police [vice police tend to ask impertinent questions, for example: "do you cum when your boyfriend licks you out?", which is supposed to ferret out sham love relationships, which are considered coercion], being housed by the employer, not having arranged one's own journey to the Netherlands, having clothing or condoms supplied by a third party, sleeping at the place of work, having a bodyguard, and a slew of other vague signals. Sex workers try to avoid giving any of these signals, because apart from getting them listed as presumed victims of trafficking, signals of trafficking give police the authority to perform invasive "investigations" into their situation.

The same goes for the Marechaussee, the paramilitary police that performs border control duties. Their findings have been excluded from the report because they report mostly on a behavior that has been explicitly declared "not trafficking" by the Dutch Supreme Court, even though it was included in Dutch trafficking law.(7)

Immigration services also report to Comensha. They process immigrants, who are invited to volunteer stories of trafficking. Immigrants who stand no chance of being allowed to stay are told that if they report being trafficked, they will be allowed to remain in the country for the duration of the investigation, and if a conviction results - or if the amount of time the immigrant remains in limbo exceeds five years - they will get permanent residence status. As can be expected, this regulation is vulnerable to abuse, and delivers a steady stream of presumed victims.(3)

Another source is the flourishing industry providing services to victims of trafficking. Tens of millions of euros are turned over by companies sheltering or rehabilitating trafficking victims, willing or unwilling. This industry receives most of its victims from the police. A surprising number of parents also decide to commit their child to these companies when they suspect the child is under the influence of evil "loverboys", young men charming innocent girls into a life of vice. This rescue industry reports to Comensha, and has no problem overreporting. Most of these companies also lobby against "sexual transgression" such as sexting, teen sex, porn, and sex work. Overreporting is part of their lobbying.

A further channel of counts of presumed victims is health care and youth services. Healthcare services are supplied with the same hypersensitive signal sets used by the police to report suspected trafficking. An example would be a high-profile case in Utrecht, where healthcare professionals broke their professional confidentiality to report that a prostitute was pregnant while still working, and was therefore a presumed victim of trafficking. Another report showed that a prostitute had come in with a broken ankle, and was therefore a presumed victim of trafficking. That this was the same woman didn't prevent her from being counted as two presumed victims. Youth services report any unruly or runaway teen who has a boyfriend, especially if the boyfriend is black or Muslim, as a presumed victim of trafficking. There are also parents who think their daughter is under the spell of a trafficker when she displays unmanageable behavior, because it can't be their upbringing that is at fault. These girls are also reported by youth services as presumed victims.

The final channel is mentioned here only because it is separately listed in the report: the regional coordinators on human trafficking. These only aggregate regionally collected reports of presumed victims, and do not collect signals separately.

Notably absent is a listing of people coming forward of their own volition to report being a victim of human trafficking. This is likely because in most cases, the presumed victim disagrees with her victim status. They don't consider themselves victims, because they chose to do the work under the conditions offered. They consider it a good deal, and prefer it over the alternatives. However, whether they are voluntary workers or coerced victims is not up to them; human trafficking is a crime that needs no charges from the victim, nor even agreement from the victim about her own victim status. The prosecutor decides for the victim whether she's a victim, and whether what she chose to do was voluntary.

This is a repetitive theme in human trafficking literature. Victims don't come forward. Victims delude themselves into thinking it's what they want. Victims don't understand victimhood isn't the better option and don't want to be rescued. Victims are so destroyed by their victimhood that they are no longer capable of deciding what is their own will. Frustrated pollsters have to use "specialized" questionnaires and questioning techniques to trick presumed victims into "admitting" they are a victim, and even then avoiding the word "victim". The low incidence of people reporting to be a victim is a thorn in the side of the rescue movement, and their expectations of waves of human misery must be justified by using very questionable and misleading interrogation techniques. For the greater good, of course.

Because of the stratification of victims in the report, which is somehow meant to make the report more "robust", we can immediately see some of the failures of the application of the MSE method. For instance, male exploitation outside the sex industry is estimated with a relatively low dark number. This arises from the lack of victim hunts, because of the limited interest in these non-pornographic stories. Once labor trafficking victims come forward, they make no effort to hide, and turn up on all lists available to them. But they have to be in desperate straits before they come forward, because the police and the rescue industry have little to nothing to offer them. They can hope for little more than prosecution of those who wronged them. We should therefore expect few to come forward, and a large dark number. MSE sees all reported individuals show up on many lists, and estimates low.

On the other hand, we have the group of minor female victims exploited in the sex industry. This consists of claims that girls are under the influence of "loverboys", psychological wizards charming hapless girls into prostitution. This category of girls is notably absent from the lists of police surveys of the sex industry, and from the lists of prostitution aid facilities. The logical conclusion is that most of these presumptions of victimhood must be duds, because the girls are hardly ever actually found in their suspected sex work roles. But MSE just sees the lack of overlap between the lists, and explodes the estimate. It estimates huge numbers of invisible child prostitutes, forever out of reach of the police.

Finally, the MSE method rests heavily on individuals in the various lists being matched or unmatched in other lists. The collecting agency Comensha, however, has no trouble listing individuals whose identifying information is incomplete. Exactly how they register and tabulate the data is a matter of some mystery, because different employees make conflicting statements, but given the vagueness of much of their statistics, and their claim not to register "personal information" which would conflict with Dutch privacy law, there is a subset of individuals for whom matching with other lists is unclear, again likely driving up the estimates.

With many services reporting at the drop of a hat, using a system of oversensitive rules that is biased towards gross overreporting, it isn't strange that there must be many false positives. The challenge becomes to estimate how large the overreporting is.

Each of the presumed victims is counted through direct contact, and is therefore available to police to act upon. If there is actual evidence, then a court case might follow, which in turn might result in a conviction. The number of convictions and court cases is recorded, and allows for an estimate.

It is reasonable to assume that police will take action on those cases that are the most serious, and have the clearest evidence with the highest probability of conviction. This would be the most efficient application of their resources to stem the most crime. Therefore, it is reasonable to view the cases brought before the court as the most heinous, and the ones with the most solid evidence.

It should be noted that human trafficking cases are treated by special courts. What makes these courts special is that the judges have had special training to become experts in the trafficking lore that is dogma in trafficking lobby circles. This was done explicitly so that judges would view evidence in the light of many dogmatic presumptions about the situation of sex work and other trafficking backdrops, and would thus have lower thresholds of evidence to convict. Proving trafficking has been made very easy indeed.

This is illustrated by a court case from 2014(4), where the Court of Appeal overturned a trafficking conviction in first instance because the original verdict was based only on the contradictory statements of the purported victim, without any other evidence. This is no exception. Convoluted suspicions of organized groups of criminals making girls fall in love with them in order to exploit them sexually in Dutch prostitution, committing hideous acts of coercion and exploitation upon them, are considered proven by producing snippets of snooped telephone calls, or fragments from months of interrogations of the victims. Whether or not the victims are willing to act the part of victims.

Convictions become even easier when minors are involved. A minor in sex work is always considered coerced, regardless of any evidence that she herself took the initiative and did the work voluntarily. She is automatically a coerced victim, even if there is nobody in sight to convict for the coercion. It's enough for a trafficking conviction to find a minor prostituting (and how easy this is, is borne out by registered minors in sex work being found within days of starting operations) and to show that someone facilitated her. Automatic conviction for forced prostitution.

Even with these special courts, there is only a 70% conviction rate. This seems respectable, until one considers that the usual conviction rate in the Netherlands is 90%; the courts are usually a police-to-prison pipeline. Acquittals in human trafficking cases therefore run at triple the usual rate. Prosecutors blame this on the extreme difficulty of proving human trafficking, even though the threshold for coercion has been lowered to "has a boyfriend" or "comes from a less affluent country".

Considering all this, it is interesting to note that there are roughly ten times as many presumed victims as there are human trafficking court cases. In 2015, 1,321 presumed victims yielded only 139 convictions. The rest were even weaker than the cases already tried. At what point should we call a presumed victim a red herring?

One can also wonder how many of these convictions are accurate. Quite apart from any discussion of which behaviors should or should not be punished, and quite apart from the discussion of how harsh the penalty should be, the decision to consider a case proven can be examined. When reading verdicts, it is remarkable how far judges stretch the interpretation of vague items of evidence to justify a conviction. Prosecutors speak of a worrying trend of "beautiful convictions" getting overturned on appeal.(8) This happens in highly hyped cases that were paraded in the press during the trial in first instance. Something is rotten there. What proportion of cases should be considered overturned is unclear: appeal cases can drag on for years and the rules change every few years, so the data on original verdicts issued in 2015 will not solidify this decade. Older cases don't offer solace either, because partial acquittals are not recorded as acquittals in the reports of the National Rapporteur, and compiling one's own index of acquittals on appeal requires cooperation from the Justice department. Quite apart from this, many of the accused will not want to go through an appeal procedure, to avoid further pillorying in the press and years in limbo. However, we will not pursue this line any further, because we simply don't have the numbers. The report gets a free pass on this one.

To treat vague signals as if they were confirmed cases of human trafficking is misleading. Suspicions must survive Occam's razor to be taken seriously. Considering the lower conviction thresholds for trafficking cases, the low conviction rate can generously be said to stem from the good cases being used up, and bad cases without a chance of success being tried against all odds. It implies we've seen pretty much all cases worth a trial. In other words, the best cases are being tried, from best to worst, and as the viable ones run out, even nonviable cases get sent to court. If instead the limiting factor were police or court capacity, as the Rapporteur often claims, we would see a higher than average conviction rate, as the police would only have capacity for the cream of the crop, and wouldn't get around to the questionable cases.

If we take all this together, we see 139 cases with any chance of conviction, while the rest of the presumed victims don't come with enough evidence to meet even the lowest standards of proof. That would mean that, at the very most, only 10% of signals should be considered valid, even when viewed through the heavily biased eyes of the modern human trafficking courts, and even when pretending that overturning on appeal never happens.

It's always important in mathematics to keep track of what the quantity you're working with represents. If we start our analysis with one quantity, it doesn't magically change into another quantity through the computation. The numbers we have available are events which lead to a conviction in Dutch trafficking courts, and they do not metamorphose into actual cases of trafficking by any other standard.

Now, as a gauging exercise, we can try to estimate very roughly how wrong the report is, based on this derivation. We ignore all fitting and adjusting. For the reasons given in earlier paragraphs, we may assume this errs on the side of caution, since we are allowing all the modeling that was applied to get a "desirable" result. We know that the error rate in the estimate is the square of the error rate in the input data, and we can see that the input data is systematically 1321/139 = 9.5 times too high. To first order, this means that the estimate would be 90.3 times too high. Applied to the number of trafficking cases per year, claimed to be 6250 by the National Rapporteur, the result would be a tepid 69 cases per year. That's about half of the actually convicted cases. Since we are estimating the number of events which would lead to a conviction if spotted by the authorities, this is an obvious underestimate.
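In code, this back-of-the-envelope correction looks as follows; the 1321, 139 and 6250 are simply the figures quoted in this article.

```python
# Back-of-the-envelope correction of the headline estimate, as described above.
inflation = 1321 / 139          # input lists roughly 9.5 times too high
error_factor = inflation ** 2   # squared in the output: ~90.3
print(6250 / error_factor)      # ~69 cases per year
```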

But let's look a little further. As can be read in the report, an earlier attempt at MSE estimation had been made in 2014. It didn't make much of an impression, because the numbers it generated were too large for the tastes of the Bureau of the National Rapporteur, so it was quietly ignored. Not because they balked at unreasonably high numbers, but because they would look silly claiming that 140% of the estimated number of sex workers was being coerced. Let's take that 2014 estimate into consideration instead, since far fewer MSE-related adjustments and compensations had been applied to it. This "purer" MSE application estimated that between 14,000 and 23,900 victims were out there. Applying the same haircut to this earlier, purer application of MSE, we get an estimate of 155 to 265 victims in total. These numbers are far more believable: between 53 and 90 percent of actual cases are prosecuted and lead to convictions.(5) That is what you would expect for a crime which harms the victims a lot, and is hunted with great policing effort and great police powers.
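Applying the same factor to the 2014 range reproduces the numbers quoted above:

```python
# The same haircut applied to the 2014 MSE estimate of 14,000-23,900 victims.
error_factor = (1321 / 139) ** 2
low, high = 14000 / error_factor, 23900 / error_factor
print(round(low), round(high))  # ~155 to ~265 victims in total
print(139 / high, 139 / low)    # ~0.53 to ~0.90 of cases ending in conviction
```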

What has been shown here is a crude comparison, a quick and dirty manipulation of the results based on some basic mathematics of error dynamics in MSE. It is merely meant to illustrate the flaws and inconclusiveness of the report under scrutiny. The clunkiest modeling available is used here, but the principles hold just as true for more refined and complex modeling. Running R packages on dummy datasets does not show significant deviation from these basic approaches. Basic enveloping math always works, no matter how obfuscated and overly complex the models get.

Consider for a moment what establishing a number of victims would be good for. It is currently only used to cause shock in the press. In years that the number went up, that was cause for concern, and government was urged to ramp up the fight against trafficking. In years when the number went down, that was cause for concern, and government was urged to ramp up the fight as well. The actual numbers make no material difference.

No matter how you view it, this report is in no way a breakthrough. It isn't even useful. Tabulating how many victims there are is already of no utility, but when the data is made useless for any conceivable genuine purpose by adding so much noise to the signal, there is really no point to the entire effort. Extrapolating from this senseless data set is just masturbation. Mathmagicking garbage data into breakthrough truth doesn't happen. If the numbers were indeed important, the effort would go into finding the right level, not into jacking them up. If a high but incorrect number is a great thing in itself, why not just declare the entire Dutch population victims and be done with it?

One does not push the collection of polluted, inflated, biased data for years if the purpose of the collection is to use a modeling tool like MSE to find the sober truth that has been hidden by all the pollution. The MSE operation was meant to push the same line as the original collection of data. The entire reason for the collection of these pointless high numbers, the reason the bureau of the National Rapporteur even exists, is to lend credibility to a moral panic. Mull over the following quote, for instance:
The number of reported victims of human trafficking has decreased greatly over the past five years, from 1287 in 2012 to 952 in 2016. In 2016 the count decreased by 17 percent from 2015. National Rapporteur Corinne Dettmeijer: 'I seriously worry about the decreasing number of reports. Human trafficking doesn't, indeed, decrease: we now know that the number of victims per year is 6250. That means that an ever greater proportion of human trafficking remains out of view. This should occasion the police to release extra capacity to combat human trafficking.'(6)
The law is a tool in the hands of people who want to steer society to conform to their ideals. These people can use the law to repress phenomena in society that they don't like. Whether they are people who find sex outside the boundaries they are comfortable with icky, people who want to please their God, or people who just want to feel like the savior or bane of some poor immigrant, the law is the most powerful tool for the self-righteous. Human trafficking policy is their playground. It can be invoked against immoral sex, immigrants, capitalism, the shadow economy, and other liberties that are viewed with suspicion. That means very many activist groups can get behind laws repressing human trafficking, even when they have very little eye for the real situations in which people are being coerced and exploited. Whether they do it out of conviction or to make money off the whole circus, they are willing to make up whatever story is necessary to get their way.


(1): https://www.dutchrapporteur.nl/current/news/reliable-estimate-reflects-true-numbers-of-victims-of-human-trafficking.aspx
(2): https://www.nationaalrapporteur.nl/binaries/An%20estimation%20of%20the%20numbers%20of%20presumed%20human%20trafficking%20victims%20in%20the%20Netherlands_tcm23-282232.pdf
(3): See for instance https://www.mensenhandelweb.nl/taxonomy/term/2364
(4): https://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:GHARL:2014:3064
(5): This is a comparison of an MSE estimation from 2014 to conviction data from 2015. Not strictly correct, but enough for a decent ballpark estimate.
(6): https://www.nationaalrapporteur.nl/actueel/2017/minderjarige-meisjes-vaker-slachtoffer-van-seksuele-uitbuiting-dan-gedacht.aspx
(7): https://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:HR:2016:857
(8): https://www.trouw.nl/home/in-de-strijd-tegen-mensenhandel-is-elke-veroordeling-een-overwinning~ac0ffa45/

40 comments:

Anonymous said

Who wrote this?

asynto said

A short introduction would be nice.
Who wrote it, why is Zondares posting it, and preferably also a summary in Dutch.

Then we'd know whether it's worth the trouble to read.

Anonymous said

This took some time to read, but very well done. It goes into too much detail for your broader readership, but some things shouldn't be flattened too much either. Making complicated material accessible is an art, and you master it.

Zondares said

When the National Rapporteur's report came out, there was a lot of fuss in the press about how we finally have the real numbers. I went after it right away, but the background of the numbers was really too complicated for me. So I asked my statistics guy about it. It took him a very long time, and he came up with this. I thought it was something that belonged on my little blog.

Anonymous said

I normally don't reply. I read your blog via Bing translation, but your English is excellent. I keep track of Dutch prostitution policy only for the influence it has on German legislation. This is a very important article when this story will also come to Germany. Thank you.

Anonymous said

I dropped out at the formula. Just say what you think; it's just an opinion anyway, even without a formula.

GGH said

Fine work, but we needed this last week.

Anonymous said

Fantastic! The mathematics will probably be over most people's heads, but I really like it. And it's airtight. That has nothing to do with opinions. The conclusion touches on motivation, and that is of course somewhat more speculative, but it takes nothing away from the analysis of the report. That stands rock solid.

KSP said

@GGH:
If you were so keen on it, you could have gotten your own mathematician to do the work. We're working with what we have now. Don't shit on what you get given.

GGH said

You don't need to be that way. We had to approach the panel now without anything to counter the science part. We don't have a mathematician like you do. You guys kept promising.

KSP said

@GGH Our mathematician delivered. This just takes time to do right. You've got no idea how much work this is. That you didn't get it when you wished for it doesn't mean we've done anything to get in a huff about.

GGH said

We're supposed to be on the same team and we're working for you too, so chill out.

Anonymous said

Technically very well researched. Very carefully chosen criticisms. Mathematically sound, but very readable. Very well done.

Anonymous said

If u make a link by copying from the right, u put the whole link in google translate u can the whole posting in one time translate in your own language. The translation to dutch go quite oke, because the language is quite simple with not too krezi creative words or poetic descriptions. Wenn you first read it in that way and then the english version it is easyer for the brain, although for my brain.

Hey Zondares, just skip Monday's posting for once, enjoy a bit of rest .......... we're going to be busy with this for a while. What the fuck?

Who said hoo’s are stupid?

Anonymous said

Real mathematical, but kinda misses out on emotional levels.

Anonymous said

Far too long. Take an example from the one and only Felicia Anna, who has her own website where she tells you everything you need to know.

Anonymous said

I voted against news pieces because they are so boring, and this is the most boring news piece you have ever written.

Ook een wiskundige [also a mathematician] said

First read your piece, then puzzled for a while over whether it holds up. It cuts some corners, and here and there things could be improved, but damn solid work for something aimed at the general public. The main thrust of your piece remains defensible even when I add my points of criticism. The ending, where you indict the Rapporteur's motivation, I find less strong. Claims like that require extraordinary evidence.

Anonymous said

Excellent work!
Fantastically explained!
Kudos!

Emil said

Well written, excellent explanation.
Especially clear in showing that the Rapporteur uses complicated mathematical models, while simpler models already suffice to show it can't be right.
Greetings from a data analyst

Anonymous said

Elegantly done. You give enough hooks for the experts to dig into the deeper material (ML etc.), but you keep a flow that laymen can still follow.

It would have been just a bit stronger if you had taken that one extra step and eliminated note 5.

Anonymous said

Strictly speaking you don't use enveloping, you use first order covariation to set a ceiling. Your point is still valid but you shouldn't use the wrong name.

Anonymous said

Just think: if it weren't for activists like the writer of this blog, the science community wouldn't even read these reports, much less criticise.

Anonymous said

The report has been issued by UNODC. Nobody mistrusts the UN. There's nothing scientific criticism will do.

Anonymous said

Far too heavy. Far too long. This is a chore to read, and too complicated to follow. Just say that it's wrong. We only need the conclusion.

Anonymous said

I don't understand a fucking word of it. At least do it in Dutch then.

Anonymous said

First derivative of the estimator to the error ratio is huge for high error if you take simple explicit estimators like in Cho or Sanathanan. Same result for first difference in R.

Anonymous said

This sort of thing is why your blog doesn't get read anymore. Dry and boring. Sex sells!

Anonymous said

I don't know who you're writing this for, but I dropped out after a few blocks of text.

Anonymous said

What do those newts and tadpoles have to do with it?

Anonymous said

I would have approached it differently. Your explanation of the assumptions is too verbose, and could be omitted. The point is well made when only the error sensitivity is explained. The tedious and complex explanation of the several agencies does not contribute either.

Anonymous said

Good clarity, well done. Mathematically sound, brought down to the simplest representation. Mixing it with politics however, not so good.

Anonymous said

excellent piece, great example with the newts

Ted said

I've said it before, a basic statistics course should be compulsory to take public office. Well done, and very vivid examples.

Edward K said

The examples are really on point, and I like the consistent tone throughout. Well done. Do you have mathematical training?

Anonymous said

I come from a statistics background, and though the terminology is a bit sloppy, I think it's a good read. Halfway through I started getting a sense that the writer was slowly and methodically tying a noose, and the drop came at the end. I don't agree with criticisms of the political content, since it's shown quite clearly how it's relevant.

Anonymous said

Wow, devastating. The inescapability of a conclusion drawn mathematically. A very special article, Zondares.

Anonymous said

Impressive indeed!

M. Carpenter said

This is a stunning piece of work, worthy to be published in a serious journal. Have you considered getting this peer reviewed?

Anonymous said

"Not what I would have written"