A bit about Receiver Operator Curves and Cesarean Delivery
In a few posts I have mentioned Receiver Operator Curves (ROC), and a few folks have asked what I mean, so I want to explain it. This is an extremely important concept in medicine, and in decision making in general. Unfortunately, it is also quite complex. So complex, in fact, that it is possible to explain an ROC in highly mathematical terms, such that few would understand (and yes, it can get over my head as well). To see this kind of explanation, check out the Wikipedia entry on the ROC. But I want to try to make it a little simpler.
Let’s take the example we have been working with about cesareans for protracted labor, and see if we can think about an ROC for the decision on whether or not to do a cesarean. Consider two populations of women: 1) women who, given enough time, will deliver a healthy baby, and 2) women whose babies, given enough time, will be injured in utero, or will be delivered vaginally but injured or dead. Now, consider the decision of whether or not to do a cesarean delivery. If this decision (the test) is to do a cesarean, we would say that the test was positive, and if the decision were to await a vaginal delivery, we would say that the test was negative. A cesarean delivery for dystocia done in group 2 would be a correct decision (a true positive). A cesarean delivery done in group 1 would be an incorrect decision (a false positive). Waiting for vaginal delivery in group 1 would be the correct move (a true negative), and waiting for vaginal delivery in group 2 would be the wrong move (a false negative).
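To make the four outcomes concrete, here is a small Python sketch of the decision-versus-truth grid described above. The labels and the two "groups" are illustrative only, not clinical definitions:

```python
def classify(decision_cesarean: bool, needs_cesarean: bool) -> str:
    """Map the cesarean decision (the 'test') against the true group.

    needs_cesarean=True corresponds to group 2 (baby will be injured
    without intervention); False corresponds to group 1.
    """
    if decision_cesarean and needs_cesarean:
        return "true positive"    # cesarean done in group 2: correct call
    if decision_cesarean and not needs_cesarean:
        return "false positive"   # cesarean done in group 1: unnecessary
    if not decision_cesarean and not needs_cesarean:
        return "true negative"    # waited in group 1: correct call
    return "false negative"       # waited in group 2: missed cesarean

print(classify(True, True))    # → true positive
print(classify(False, True))   # → false negative
```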
So what about the ROC? An ROC is a graph of the sensitivity of a test versus one minus its specificity (the false positive rate). OK, it's getting confusing already, and that's why the ROC is a little hard to understand.
Sensitivity is the likelihood that the test will correctly identify those with a condition (likelihood that babies that need to be delivered by cesarean to be uninjured will get a cesarean), and specificity is the likelihood that the test will correctly identify those without the condition (likelihood that those who will eventually deliver vaginally uninjured will not get a cesarean).
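These two definitions reduce to simple ratios of the four outcome counts. A minimal sketch, using invented counts purely for illustration:

```python
def sensitivity(tp: int, fn: int) -> float:
    """Of those who truly need a cesarean, what fraction get one?"""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Of those who would deliver vaginally uninjured, what fraction avoid one?"""
    return tn / (tn + fp)

# Hypothetical counts: 90 needed cesareans performed, 10 missed;
# 70 correct waits, 30 unnecessary cesareans.
print(sensitivity(tp=90, fn=10))  # → 0.9
print(specificity(tn=70, fp=30))  # → 0.7
```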
Sensitivity and specificity of a test depend on the cutoff value that one chooses to put on the test – that is, where the line is that defines positive versus negative. In a case like iron deficiency, which can be defined objectively, we could have an objective cutoff like a ferritin of 100, and decide that those under 100 test positive and those over 100 test negative. Then we could compare those results to some gold standard, like bone marrow iron stores, and decide what the sensitivity and specificity were. We could then look at what they would have been if the cutoff had been 50. And again at 150. And again at 10, and then 20, and then 30, and so on. And when we graphed all those points, what would we have? An ROC!
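That cutoff sweep can be sketched in a few lines of Python. The ferritin values and gold-standard labels below are invented for illustration only; each cutoff yields one point on the ROC:

```python
# Hypothetical patients: (ferritin value, truly iron deficient by the
# gold standard of bone marrow iron stores). Invented data.
patients = [(8, True), (15, True), (25, True), (40, False),
            (60, True), (90, False), (120, False), (200, False)]

def roc_point(cutoff: float) -> tuple[float, float]:
    """Sensitivity and 1 - specificity when 'ferritin < cutoff' is positive."""
    tp = sum(1 for f, sick in patients if f < cutoff and sick)
    fn = sum(1 for f, sick in patients if f >= cutoff and sick)
    fp = sum(1 for f, sick in patients if f < cutoff and not sick)
    tn = sum(1 for f, sick in patients if f >= cutoff and not sick)
    return tp / (tp + fn), fp / (fp + tn)

# Sweep the cutoffs mentioned above; graphing these points gives the ROC.
for cutoff in (10, 20, 30, 50, 100, 150):
    sens, one_minus_spec = roc_point(cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens}, 1-specificity {one_minus_spec}")
```

Note the trade-off in the output: a strict cutoff of 10 catches few deficient patients but mislabels no one, while a loose cutoff of 150 catches them all at the cost of many false positives.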
In the cesarean example, it is a little more obscure, but in some ways more apropos to real medical decision making. In this case, the cutoff is not an objective value, but an internal thought process of how convinced we are going to need to be before we will take action. Are we going to do a cesarean at the first sign of trouble (way out towards sensitivity), or are we going to wait for a really terrible strip, or a woman whose labor has been arrested for 12 hours, before we go to the operating room (way out towards specificity)?
Ultimately the ROC describes just how good a test is. It describes the interplay between sensitivity and specificity – how much of one we have to give up to get some of the other. If a test is great, we may be able to get very high specificity and sensitivity at the same time. If a test is not as great, we may only be able to have one at a time, depending on what cutoff value we choose to use.
Here is an example of an ROC for a typical medical test, where sensitivity and specificity are traded for one another at different thresholds.
One can see that the closer that line hugs the left and top parts of the graph, the better the test will be; the more sensitivity and specificity one can simultaneously have.
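The "hugs the top-left" idea is often quantified as the area under the curve (AUC): a perfect test scores 1.0, a coin flip scores 0.5. A rough sketch using the trapezoid rule, with two invented sets of (1-specificity, sensitivity) points for illustration:

```python
# Invented ROC points (1 - specificity, sensitivity) for two tests.
good_test = [(0.0, 0.0), (0.05, 0.7), (0.1, 0.9), (0.3, 0.97), (1.0, 1.0)]
weak_test = [(0.0, 0.0), (0.2, 0.3), (0.5, 0.6), (0.8, 0.85), (1.0, 1.0)]

def auc(points: list[tuple[float, float]]) -> float:
    """Area under the ROC curve via the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

print(auc(good_test))  # close to 1.0: hugs the top-left corner
print(auc(weak_test))  # closer to 0.5: nearer the diagonal
```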
So why does this all matter in the cesarean section case? Because it demonstrates that there is no absolute to these decisions. Some commenters have tried to make the case that many cesareans are unnecessary, and they are of course correct. Some commenters have made the case that most cesareans are necessary, and they are correct as well. It all depends on where you put your cutpoint, and what the ROC for the decision looks like.
If our #1 outcome is to prevent any neonatal injury from intrapartum asphyxia and infection, we could do cesareans for everybody, at the expense of doing many cesareans that were not necessary. That would be running our setpoint all the way on the right side of the ROC. If our goal was to prevent every cesarean but the ones that were obviously necessary, we could run all the way on the left, doing the minimum number, but also failing to do cesareans for babies that might have ultimately needed them.
For this particular problem, we don’t know exactly what the ROC looks like, because we don’t have a gold standard test that can tell us what will happen to a baby if it is or is not delivered by cesarean. But this idea illustrates some of the difference between me and the OB/GYN commenters on one side, and some of the midwifery and doula commenters on the other. OB/GYNs tend to run their setpoint further to the right on the ROC, while midwives and doulas prefer to run further to the left. OB/GYNs go for sensitivity, while the midwives and doulas go for specificity.
In complex decision making, this idea is crucial. Any time you change your decision threshold, you will trade sensitivity for specificity. To make a good decision, one has to be honest about what one fears most: a false positive or a false negative. In OB/GYN, we fear the false negative: the cesarean we failed to perform for a baby that needed it. And that’s why we tend to run to the right. Perhaps a little too far, as I have suggested before.