answers to those doggone thermal design questions
by tony kordyban
my boss thinks i need to put in a time-delay in our fan-cooled system, so that the fan will stay on for about five minutes after the power is turned off to the electronics. he claims that if we turn off the fan at the same time as the electronics, the component temperatures will "overshoot" because of "thermal inertia." i can't find "thermal inertia" in any of my old textbooks. is my boss nuts, or can my parts really overheat this way?
over-worked and passed-over in overton
your question seemed to have a simple answer. unfortunately, i had two simple answers in my head, and i couldn't decide which one was right. besides that, not being trained in psychology, i am not qualified to say whether your boss is nuts.
i never heard of "thermal inertia" either. i asked a representative sample (2 people) if they had ever experienced "overshoot" of component temperatures when they shut off the power to their system. the vote was 50% yes, 50% no, with absentee ballots yet to be counted.
that taught me that voting is a dumb way to decide anything important. but now i not only had to figure out an answer to your question, i also had to explain why two people would get opposite results.
i knew, in the depths of my hypothalamus (the part of the brain that senses body temperature), that a heat source like the die of a component could not increase in temperature once its power was cut off. if the fan were shut off at the same time, the die would cool more slowly, but it could not increase in temperature without more power. no overshoot was possible. case closed.
as soon as i thought the word "case" in "case closed," it occurred to me that the same might not be true for case temperature. the case, or surface, temperature of a component is not the hottest spot in the component. it is somewhere between the die temperature and the air temperature. just because power to the die is cut off does not mean the case temperature won't go up, at least temporarily. heat stored in the die could conduct over to the case. but heat would also continue to flow out of the case into the air by natural convection, even with the air no longer being pushed by the fan. suddenly i was not so sure of my answer. another disturbing fact is that you are much more likely to measure case temperature during a test than die temperature.
voting and gut feelings didn't provide an answer. i could have built a fixture and explored this problem experimentally. but that sounded too much like work, plus it would be hard to try lots of different power levels, air velocities and material properties, in case the answer depended on those things. next i thought of making a cfd model. that could work, but i can never figure out the transient menus in flotherm.
thank goodness i didn't need an exact answer for a real component. all i cared about was whether overshoot is possible or not. this problem can be adequately modeled with transient, one-dimensional heat transfer. that only needs a small spreadsheet. it's not accurate or detailed, but it is good enough to see if overshoot happens.
i made the simplest, one-dimensional component possible, which would still be able to tell the difference between die temperature and case temperature. it is shown in figure 1.
each layer of the component body has only a single temperature. i assumed all the heat from the die has to pass through the case to get to the air. i also simplified things by assuming the air temperature was constant. power is applied only to the die, and heat is carried away by convection only from the case.
all these assumptions make it fairly simple to write a set of energy balance equations for the two layers of the component body, both for the steady state when the power and fan are on, and for the transient situation when the fan and the power are suddenly turned off together. i won't develop them for you here, mainly because you'll skip to the end anyway. but i've buried them in a spreadsheet that you can play with yourself.
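for readers who would rather not skip to the end, here is one plausible way to write those energy balances for the two-layer model. this is a sketch under the assumptions stated above, not necessarily the exact form used in the spreadsheet. every symbol is an assumed name: T_d and T_c are the die and case temperatures, T_a the constant air temperature, C_d and C_c the heat capacities of the two layers, R_dc the conduction resistance between die and case, P the power, A the case surface area, and h_on and h_off the heat transfer coefficients with the fan on and off.

```latex
\begin{align*}
% steady state, power and fan on
T_{c,0} &= T_a + \frac{P}{h_{on} A}, &
T_{d,0} &= T_{c,0} + P\,R_{dc} \\[4pt]
% transient after power and fan are cut together at t = 0
C_d \frac{dT_d}{dt} &= -\frac{T_d - T_c}{R_{dc}}, &
C_c \frac{dT_c}{dt} &= \frac{T_d - T_c}{R_{dc}} - h_{off} A\,(T_c - T_a) \\[4pt]
% slopes at the instant of shutdown, from the steady-state values
\left.\frac{dT_d}{dt}\right|_{t=0} &= -\frac{P}{C_d} < 0, &
\left.\frac{dT_c}{dt}\right|_{t=0} &= \frac{P}{C_c}\left(1 - \frac{h_{off}}{h_{on}}\right)
\end{align*}
```

the last line hints at the outcome. in this model the die starts cooling immediately, while the case starts out warming whenever h_off is smaller than h_on, because the die is still feeding it heat at the old rate while the air is carrying less away.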
the spreadsheet allows you to plug in dimensions and material properties for the die and case, power dissipation (when the power is on), and values for heat transfer coefficient for when the fan is on and the fan is off (please choose larger values to represent the fan being on, ok?). it gives as a result a table of die and case temperatures vs. time after the fan and power are cut off, and a graph plotting the same thing. have fun with it.
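if you don't have the spreadsheet handy, the same kind of two-layer calculation can be sketched in a few lines of python. this is a stand-in, not the spreadsheet itself: the function name `simulate_shutdown`, the explicit time-stepping scheme, and every default value are assumptions chosen for illustration, and the layers are described by lumped heat capacities and conductances instead of dimensions and material properties.

```python
def simulate_shutdown(power=2.0, c_die=2.0, c_case=5.0, r_dc=2.0,
                      ha_on=0.1, ha_off=0.02, t_air=25.0,
                      dt=0.01, t_end=600.0):
    """Two-node model of a component after power and fan are cut together.

    All defaults are illustrative assumptions, not measured data:
      power   W    dissipation while the system is on
      c_die   J/K  heat capacity of the die layer
      c_case  J/K  heat capacity of the case layer
      r_dc    K/W  conduction resistance, die to case
      ha_on   W/K  convection conductance (h * area) with the fan on
      ha_off  W/K  convection conductance with the fan off
      t_air   C    air temperature, held constant
    Returns (times, die_temps, case_temps) as lists.
    """
    # Steady state with power and fan on gives the starting temperatures.
    t_case = t_air + power / ha_on
    t_die = t_case + power * r_dc

    times, die, case = [0.0], [t_die], [t_case]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        q_dc = (t_die - t_case) / r_dc    # heat flow die -> case, W
        q_ca = ha_off * (t_case - t_air)  # heat flow case -> air, W
        # Explicit Euler step; the power term is zero after shutdown.
        t_die += dt * (-q_dc) / c_die
        t_case += dt * (q_dc - q_ca) / c_case
        times.append(i * dt)
        die.append(t_die)
        case.append(t_case)
    return times, die, case


if __name__ == "__main__":
    times, die, case = simulate_shutdown()
    peak = max(case)
    print(f"die at shutdown:  {die[0]:.2f} C, die peak:  {max(die):.2f} C")
    print(f"case at shutdown: {case[0]:.2f} C, case peak: {peak:.2f} C "
          f"at t = {times[case.index(peak)]:.1f} s")
```

the time step has to stay well below the shortest time constant in the model (roughly `r_dc * c_die`) or the explicit scheme goes unstable, which is why the default `dt` is so small.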
in the meantime, figure 2 shows one typical result.
it's based on fairly realistic values for a component, i think. figure 2 shows a pattern that stayed the same no matter what values i tried. the die temperature never overshoots. (as my hypothalamic reaction told me.) but the case temperature always overshoots. (like my second guess.) how much it overshoots varies a lot with the parameters you choose. but note also that the case temperature never goes higher than the die temperature.
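to get a feel for how much the overshoot depends on the inputs, here is a small sweep over the fan-off convection value, using the same kind of two-layer lumped model. it is only a sketch: the function name `case_overshoot` and all the numbers are illustrative assumptions, not values from the spreadsheet. the last value in the sweep sets the fan-off conductance equal to the fan-on one, which is a crude way of representing a fan that keeps running after the power is cut.

```python
def case_overshoot(ha_off, power=2.0, c_die=2.0, c_case=5.0, r_dc=2.0,
                   ha_on=0.1, t_air=25.0, dt=0.01, t_end=600.0):
    """Peak rise of case temperature (K) above its value at shutdown.

    Two-node lumped model with assumed, illustrative parameters:
    heat capacities in J/K, r_dc in K/W, conductances ha_* in W/K.
    """
    # Steady state with power and fan on.
    t_case = t_air + power / ha_on
    t_die = t_case + power * r_dc
    start = peak = t_case

    # Explicit Euler march after power and fan are cut at t = 0.
    for _ in range(int(round(t_end / dt))):
        q_dc = (t_die - t_case) / r_dc    # die -> case, W
        q_ca = ha_off * (t_case - t_air)  # case -> air, W
        t_die -= dt * q_dc / c_die
        t_case += dt * (q_dc - q_ca) / c_case
        peak = max(peak, t_case)
    return peak - start


if __name__ == "__main__":
    for ha_off in (0.01, 0.02, 0.05, 0.1):
        rise = case_overshoot(ha_off)
        print(f"hA_off = {ha_off:.2f} W/K -> case overshoot {rise:.2f} K")
```

in this simplified model the overshoot shrinks as the fan-off convection approaches the fan-on value, and it vanishes when the two are equal.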
that explains why two experimenters could get opposite results. if one was measuring die temperature and the other case temperature, one would see overshoot and the other wouldn't.
now, as to over-worked's question about his fruitcake of a boss: do you care if case temperature overshoots when you turn off the fan? i generally don't, but you might, depending on the kind of product you are designing. if you turn it on and off a lot during its life, and the overshoot is large and fast, that could create lots of repetitive mechanical stress on the components, which could shorten their life. my telecom customers never shut off power to my products once they are installed, so overshoot is not important. i worry about maximum junction temperature, and junction temperature (die temperature) doesn't overshoot.
so your boss has a valid point, even if there is no such thing as "thermal inertia."
about tony kordyban
tony kordyban has been an engineer in the field of electronics cooling for different telecom and power supply companies (who can keep track when they change names so frequently?) for the last 20 years. maybe that doesn't make him an expert in heat transfer theory, but it has certainly gained him a lot of experience in the ways not to cool electronics.
he does have some book-learnin', with a b.s. in mechanical engineering from the university of detroit and a master’s in mechanical engineering from stanford. in those 20 years tony has come to the conclusion that a lot of the common practices of electronics cooling are full of baloney. he has run into so much nonsense in the field that he has found it easier to just assume "everything you know is wrong" (from the comedy album by firesign theatre), and to question everything against the basic principles of heat transfer theory.
tony has been collecting case studies of the wrong way to cool electronics, using them to educate the cooling masses, applying humor as the sugar to help the medicine go down. these have been published recently by the asme press in a book called, "hot air rises and heat sinks: everything you know about cooling electronics is wrong." it is available at https://www.amazon.com/hot-air-rises-heat-sinks/dp/0791800741. this advice column is an extension of that educational effort.