<<ListTagged 0>>
Welcome to the home page for the 2008 //Waterloo Workshop on Advanced Quantum Theory// (''WWAQT'')!\n\nThe 2008 WWAQT will be held November 18-19, at the [[Perimeter Institute|http://perimeterinstitute.ca]]. The highlight of the conference will be a series of short presentations on a wide range of topics in quantum theory. Talks will be held in the Bob Room (4th floor -- take the elevator up from the lobby) from 6:00 PM to 8:10 PM each evening (see schedule below). Participants are advised to arrive at Perimeter between 5:50 PM and 6:10 PM -- the Institute's doors are locked for the evening, but security personnel will provide access during those times.\n\nParticipants giving WWAQT talks are //required// to attend both sessions, unless by prior arrangement with the [[Session Chair|Robin Blume-Kohout]]. Students enrolled in AM473/PHYS454 who are //not// giving talks are required to attend at least one session, and are encouraged to attend both. Other interested parties are encouraged to attend.\n\nTalks will be held to a strict limit of 15 minutes, with 5 minutes afterward for questions. An LCD projector with a standard VGA connector will be provided, as will extensive blackboard space. Those planning to use the projector are strongly advised to test out their laptop with a projector ahead of time -- and those with Mac laptops should definitely make sure to bring a VGA adapter. Presenters wishing to give overhead transparency talks should make arrangements with the [[Session Chair|Robin Blume-Kohout]] ahead of time! Presenters who do not have access to a laptop can bring a PDF file of their slides on a USB key, and a laptop will be made available.\n\n! Tuesday, November 18, 2008\n\n6:10 PM ''Nikolina Ilic'': "The Ekert Protocol"\n\n''Abstract'': The Ekert protocol <http://prola.aps.org/abstract/PRL/v67/i6/p661_1> is described. The violation of Bell's theorem that results from measuring states described by Ekert is explained. 
Error correction and privacy amplification techniques that are associated with standard protocols are briefly discussed.\n----\n6:35 PM ''Kael Dixon'': "Quantum Computation as Geometry" by Nielsen, Dowling, Gu, and Doherty.\n\n''Abstract'': Quantum circuits are unitary operators on some finite dimensional Hilbert space constructed by combining unitary operators from a finite set of "quantum gates". Geometrically, this can be viewed as a path from the origin to the desired unitary operator in the manifold of such unitary operators. With an appropriate choice of metric, the length of such a path corresponds to the number of quantum gates required to construct a quantum circuit to perform the desired unitary operation. The discussed paper shows that quantum circuits can be constructed that approximate the path minimizing the length to a reasonable amount of accuracy. This presentation will briefly introduce the ideas of quantum circuits and the universality of quantum gates, and then discuss the results of the paper. //Reference:// <http://arxiv.org/abs/quant-ph/0603161>.\n-----\n7:00 PM ''Mukto Akash'': "Yet Another Derivation - From Information Geometry to Quantum Theory"\n\n''Abstract'': In this talk I will present a new approach to quantum formalism through information geometry. Guided by a simple, yet elegant, development of quantum theory by Philip Goyal, I shall argue that we may be able to avoid much of the mathematical abstractness, for example use of complex number system to describe the state space, that is present in the current formalism, as well as to bring out some physical intuition to the theory of quantum mechanics. I shall introduce an information geometric framework on classical probability space. 
Thereon I shall show how the employment of three elementary quantum phenomena - namely complementarity, measurement simulability, and global gauge invariance - as postulates of the theory enables us to reconstruct the finite-dimensional quantum formalism. //Reference:// <http://arxiv.org/abs/0805.2770>\n-----\n7:25 PM ''Jenna Voisin'': "Quantum Games"\n\n''Abstract'': My presentation will focus on quantum games, and how the strategies involved in playing them can be applied to quantum algorithms, which can be shown in cases to be more efficient than classical algorithms. The discussion will focus on the example of the "Penny Flip" game, and will explicitly show how quantum strategies outperform classical ones in this case. The paper I am presenting is <http://arxiv.org/abs/quant-ph/9804010>, entitled "Quantum Strategies" by David A. Meyer.\n-----\n7:50 PM ''Rob Blom'': "Degrees of Freedom for Black Hole Entropy"\n\n''Abstract'': I will be discussing how the Area Law applies to the Degrees of\nFreedom with respect to the horizons of black holes. How the area law\nis a generic feature of entanglement. What exactly black hole entropy\nmeans and what a horizon means in the context of a black hole. Lastly,\ndiscussing the transition between the Ground State and Excited States\n(ES) such that the DOF away from the horizon contribute more to the\ntotal entropy (in ES); and thus deviate away from the Area Law.\nThereby concluding that the horizon degrees of freedom are responsible\nfor the area law. //Reference:// <http://arxiv.org/abs/gr-qc/0703082>\n\n! Wednesday, November 19\n\n6:10 PM ''Sonia Markes'': "Making Quantum Theory Reasonable"\n\n''Abstract'': In this presentation, I will explain Lucien Hardy's five axioms from which quantum theory can be derived. I will show that quantum theory could have been discovered without reference to phenomenology. 
The main paper this is based on is <http://arxiv.org/abs/quant-ph/0111068>, but the technical details are found in <http://arxiv.org/abs/quant-ph/0101012> so to some extent, I will cover both papers.\n----\n6:35 PM ''Matthew Badali'': "How A Chemist Does Quantum Mechanics"\n\n''Abstract'': There is an entity in quantum mechanics called the correlation function. It can be treated in different manners depending on the outcome that is desired to be computed. A common practice is to approach the quantum problem as a classical problem and add in the quantum mechanical effects. A popular method for doing this is called the Semi-Classical Initial Value Representation (SC-IVR). This process and the mathematical justifications for it (i.e., where/how the quantum comes in) will be presented for my project, following the example given in the article as a particular application. Furthermore, the computational costs associated with this method grow very rapidly as the number of bodies in the system increases, so a novel technique for reducing the complexity of the calculation is included in my presentation. Briefly, instead of integrating over the phase space of initial values (realistically, taking a Monte Carlo approach) as is usually done in SC-IVR, it is conjectured that only a few points are required, if the points are integrated over their trajectories in phase space to give time-averaged values. Again, examples will be presented as per the paper selected. //References//: <http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=JCPSA6000127000014144306000001&idtype=cvips&gifs=yes>. 
Kaledin and Miller, Journal of Chemical Physics, 118, 16 (2003); Issack and Roy, Journal of Chemical Physics, 127, 144306 (2007); Miller, Journal of Physical Chemistry, 105, 2942 (2001).\n----\n7:00 PM ''Jordan Lapointe'': "Solid State Implementation of Quantum Random Walks on General Graphs"\n\n''Abstract'': Some brief background information will be given on random walks (both classical and quantum), quantum dots, and quantum gates. The authors' theoretical and practical approach to implementing quantum random walks with a grid of quantum dots will then be discussed. Some discussion of the authors' numerical simulation will also be given, as well as an overview of the significance of the authors' results. //Reference//: <http://arxiv.org/abs/0811.1795>\n----\n7:25 PM ''Jasper Brawley-Hayes'': Quantum Information Theory: Can non-private channels transmit quantum information?\n\n''Abstract'': A recent discovery about quantum channel capacity has suggested that the classical notions of information theory are insufficient as a complete model. Graeme Smith and Jon Yard have successfully shown that two zero capacity quantum channels can be combined in a way that allows for positive quantum capacity, so long as one of the channels has some positive private capacity <http://arxiv.org/abs/0807.4935>. This is in violation of a property of classical channel capacity called additivity, where additive channels used in parallel have capacity equal to the sum of the capacities of the individual channels. Later, Graeme Smith teamed up with John A. Smolin to investigate the capacity of non-private channels. Smith and Smolin have found considerable evidence that even channels with very small private capacity (<ε) can be combined with those with zero private capacity to form positive noiseless quantum capacity <http://arxiv.org/abs/0810.0276>.\n
Applied Math 473, also listed as Physics 454 and Applied Math 673, is a senior-level course in advanced quantum theory. In 2008, it was taught by [[Robin Blume-Kohout]], and TAed by [[Chris Ferrie]].\n\nThe course is titled "Advanced Quantum Theory", but a fair subtitle (at least this year!) might be "What the heck is going on in quantum theory?" Our goal is to obtain a coherent picture of how quantum theory fits into physics -- and, in particular, where it parallels and diverges from classical theories -- while paying roughly equal attention to (a) canonical practical problems, and (b) foundational and conceptual issues. So, to pick an instance at semi-random, we'll spend some time on the theory of quantum angular momentum (which is pretty much required material, due to its importance for atomic physics and particle physics), but then we'll slide over and take a look at SU(2) phase space, a decidedly noncanonical diversion that helps connect //quantum// angular momentum with the theory of its //classical// counterpart.\n\nAnd now, on to the course details.\n\n* Logistics, policies, and technical details:\n** The [[Syllabus]] is a good place to start.\n** You ought to read the [[Homework Policy]] before you start the [[Homework assignments]].\n** At some point you might find the [[Homework Solutions]] useful.\n** The [[Project]] is a significant part of the course.\n** Naturally, there are some [[Frequently Asked Questions|FAQ]].\n* Content\n** The course [[Schedule]] is pretty central. It includes links to [[reading|Quantum Theory]] and [[homework|Homework assignments]] assignments.\n* Historical details\n** Here's a [[Complete history of changes]] to the course website during the Fall 2008 term.\n** The [[1st Annual Waterloo Advanced Quantum Theory Workshop]] was held in November.\n** [[The Journal of Advanced Quantum Theory]] published its first issue in Fall 2008.
This hasn't been around long enough for any questions asked to be frequent, but I'll try to anticipate some.\n\n!!Who?\nThe site is principally authored by me, [[Robin Blume-Kohout]]. I may open it up for collaboration in the future.\n\n!!What?\nIt's partly a course website. It's partly a textbook in quantum mechanics. It's partly a collection of useful notes and facts about the //math// of quantum theory. And, as Garrett originally said: It's open source physics.\n\n!!Why?\nI'm trying to figure out the best way to present and navigate theoretical physics. [[Semantic Networks|http://www.jfsowa.com/pubs/semnet.htm]] provide a natural structure for relating abstract conceptual information, and a wiki is a good practical equivalent. It allows a reader to quickly learn new concepts in digestible pieces, and trace forwards or backwards to the implications or foundations of those concepts — while allowing an author, or authors, to easily expand the content. I am very impressed with the way [[Wikipedia|http://en.wikipedia.org/wiki/Main_Page]] works and evolves, and this is my own personal version, modified for pedagogical purposes.\n\n!!How?\nThe two main pieces from which this site is built are [[TiddlyWiki|http://www.tiddlywiki.com/]], created by Jeremy Ruston, and [[jsMath|http://www.math.union.edu/~dpvc/jsMath/]], made by Davide P. Cervone. Garrett Lisi originally put this notebook together, and I've excavated most of his content and replaced it with my own. Both of these excellent open source software packages are under continuing development and have supportive communities. I owe thanks to many people for building the pieces used for this site, and for helping out with technical details. To delve more into the nitty-gritty of how it's put together, and see who contributed which plugins, go check out the [[Configuration]]. If things don't work perfectly, it's probably my fault, or possibly Garrett's. 
You can get everything you need to set up a similar wiki for yourself from this [[downloads directory|http://deferentialgeometry.org/download/]].\n\nTo use it... try things. Click on buttons and see what happens, you'll figure it out.\n\n!!Where\nThis site is served from a closet in England. Why? Er... globalization? And the fact that my best friend from undergraduate (a) went to grad school in England, (b) still has a server there, and (c) happened to be online when I needed a server NOW.\n\n!!When\nNow. There will be an exam.\n\n----\n<<slider chkSliderAbout 'About (slider)' 'More questions and answers »' 'Click to see more questions and answers'>>\n
Nothing to see here right now...
Quantum theory stands as the most startling development of the 20th century in physics. Both quantum theory and relativity altered our view of the physical universe forever -- but quantum theory challenges the very notion of reality. It changed our view not just of physics, but of science itself -- of the framework that we use to reason about our experience. After a century of investigation, we know a tremendous amount about how quantum theory works, what it predicts, and how to confirm its predictions with experiments. What we //don't// understand, even now, is how to make complete sense of it. Rather disturbingly, the interpretation of quantum theory -- what it //means//, and how to gain intuition for its predictions -- remains a matter of opinion, of personal choice, among physicists. That such a crucial conceptual question remains unclear is disturbing. That it remains so //despite// our fantastic success at using and analyzing the theory to model the physical world is unbelievable.\n\nThe Journal of Advanced Quantum Theory is dedicated to the long struggle between humanity and ignorance -- specifically, ignorance of quantum theory. The Editors of JAQT want more people to understand more about quantum theory. We seek clarity, intuition, and rigor. Our audience comprises everyone who wishes to understand more about the problems that quantum theory solves, and also those that it creates. Where clear and unproblematic solutions to important questions exist, we want this knowledge to be efficiently available. Where thorny and unreconciled problems persist, we want clear statements of them and their consequences. 
Most of all, we believe that a clear distinction between these categories -- between __solved__ problems and __unsolved__ problems -- is a powerful ingredient in the continuing struggle to figure out what the heck is going on with quantum theory.\n\nThe Journal of Advanced Quantum Theory solicits clearly written expository articles on important and neglected topics within the canon of quantum theory, including (but not limited to) quantum mechanics, quantum information, quantum computation, and the mathematical foundations of quantum theory. The Journal also seeks clearly-written articles on exploratory research designed to explicate, demonstrate, and clarify the principles of quantum theory. While the Journal is not intended as a forum for cutting-edge original research, articles documenting such research are welcomed if they simultaneously address the Journal's core goal of enhancing its readership's understanding of quantum theory's core principles. With this goal in mind, we ask that submissions be written at the technical level of an advanced undergraduate or beginning graduate math/physics student.
\n''Acknowledgments'': I'm forever indebted to [[Garrett Lisi|http://deferentialgeometry.org]], who put together this wiki package based on [[TiddlyWiki|http://www.tiddlywiki.org]], and generously shares it with the world. If you find some orphaned remnants of Garrett's research hidden away, don't be too shocked. I'm also grateful to [[Matthew Jadud|http://www.jadud.com/MCJ.html]], professor of computer science at Allegheny College, for helping me to host this resource. [[Michael Nielsen|http://michaelnielsen.org/blog/?page_id=181]] has consistently inspired and encouraged me. Roughly a hundred colleagues, friends, and teachers have helped me to learn enough about quantum theory to write this resource. [[Joseph Emerson|http://www.iqc.ca/people/person.php?id=34]], who at last check "still does not understand why quantum mechanics is so weird", gave me the opportunity to teach [[AM473]]. And last but very definitely not least, I am grateful to my students from AM473 and PHYS454 for their patience, enthusiasm, and constructive criticism -- Abhineet Agarwal, Mukto Akash, Matthew Badali, Anton Baglaenko, Rob Blom, Jasper Brawley-Hayes, Nathan Czarny, Kael Dixon, Daniel Fiori, Douglas Friesen, Tyler Holden, Chantal Hutchison, Durand Jarrett-Amor, Anthony Lausch, Yizhi Li, Kevin Liu, Sonia Markes, James Racek, Robert Schaffer, Maximilien Schirm, Ryan Speller, Kyle Thompson, and Jenna Voisin. May you continue to explore the dragon-filled areas.\n\n''Logistical Details'': I won't try to tell you all about TiddlyWiki, nor about the jsMath package that enables LaTeX in these pages, but here is a very brief guide:\n# Links to other pages in this wiki will open up a new "tiddler" in your browser. Move your mouse pointer to the upper right corner of a tiddler to enable several buttons (mouseover for what they do), one of which will allow you to ''close'' that tiddler.\n# This whole wiki is running in your browser, on your computer. 
This means you can't actually change anything on the server!\n# Feel free to try, though. If you double click on anything, it will open up an editor that will let you edit the wikitext! Remember, you're only editing a local copy, so mangle it all you want... and then just hit "reload" in your browser to get the clean copy from the server. Please avoid reloading too much -- the whole wiki is about a megabyte, which is kinda hard on bandwidth.\n# Italicized links are to pages that don't exist yet. Consider them officially "under construction".\n# There's a lot of math in here, and there's still no standard solution for math on the web. I use [[jsMath|http://www.math.union.edu/~dpvc/jsMath/welcome.html]] to display math. Your browsing experience will be __seriously__ influenced by your choice of browser! Internet Explorer is a ''bad'' choice, and Firefox is generally a ''good'' choice. If you're using IE, pages may take a very very long time to load.\n# You don't need to install anything to view the pages, but the math will look much nicer if you install half a dozen [[jsMath fonts|http://www.math.union.edu/~dpvc/jsMath/download/jsMath-fonts.html]]. It's really easy!
<<option chkGenerateAnRssFeed>> generate an RSS feed\n<<option chkOpenInNewWindow>> open links in a new window\n<<option chkSaveEmptyTemplate>> save empty template\n<<option chkToggleLinks>> clicking on links to notes that are already open causes them to close\n^^(override with control or other modifier key)^^\n<<option chkHttpReadOnly>> hide editing features when viewed over HTTP\n<<option chkForceMinorUpdate>> treat edits as minor changes by preserving date and time\n^^(override with shift key when clicking 'done' or by pressing ctrl-shift-enter)^^\n<<option chkConfirmDelete>> confirm before deleting\nmaximum number of lines in a note edit box: <<option txtMaxEditRows>>\n<<option chkSaveBackups>> save backups\n<<option chkAutoSave>> auto save\nfolder name for backup files: <<option txtBackupFolder>>\n<<option chkInsertTabs>> use tab key to insert tab characters instead of jumping to next field\n<<option chkDisableExcept>> show hidden system tags\n!These change how the [[UploadPlugin]] works\nUrl of the UploadService script^^(1)^^: <<option txtUploadStoreUrl 50>>\nRelative Directory where to store the file^^(2)^^: <<option txtUploadDir 50>>\nFilename of the uploaded file^^(3)^^: <<option txtUploadFilename 40>>\nDirectory to backup file on webserver^^(4)^^: <<option txtUploadBackupDir>>\n\n^^(1)^^Mandatory either in UploadOptions or in macro parameter\n^^(2)^^If empty stores in the script directory\n^^(3)^^If empty takes the actual filename\n^^(4)^^If empty existing file with same name on webserver will be overwritten
/***\nname: AllTagsExceptPlugin\nauthor: Garrett\nversion: 0.1.0\nThis is a revision of Clint Checketts' allTagsExcept plugin, which lists all tags except those listed.\n\n<<option chkDisableExcept>> show hidden system tags\n\n!!Usage\n{{{\n<<AllTagsExcept tag1 tag2 ...>>\n}}}\n!!!Code\n***/\n/*{{{*/\nversion.extensions.AllTagsExcept = {major: 0, minor: 1, revision: 0};\n\nif (!config.options.chkDisableExcept) config.options.chkDisableExcept=false; // default to standard action\n\nconfig.macros.AllTagsExcept = {tooltip: "Show notes tagged with '%0'",noTags: "There are no tags to display"};\n\nconfig.macros.AllTagsExcept.handler = function(place,macroName,params)\n{\n var tags = store.getTags();\n var theDateList = createTiddlyElement(place,"ul");\n if(tags.length == 0)\n createTiddlyElement(theDateList,"li",null,"listTitle",this.noTags);\n for(var t=0; t<tags.length; t++)\n {\n var includeTag = true;\n for (var p=0;p<params.length; p++) if ((tags[t][0] == params[p])&&(!config.options.chkDisableExcept)) includeTag = false;\n if (includeTag)\n {\n var theListItem =createTiddlyElement(theDateList,"li");\n var theTag = createTiddlyButton(theListItem,tags[t][0] + " (" + tags[t][1] + ")",this.tooltip.format([tags[t][0]]),onClickTag);\n theTag.setAttribute("tag",tags[t][0]);\n }\n }\n}\n/*}}}*/\n
//Note: As of January 1, 2009, the Journal of Advanced Quantum Theory is not currently accepting submissions. Authors are welcome to submit articles, but they will be filed and considered in turn when the Journal resumes publication.//\n\nThe Journal of Advanced Quantum Theory solicits clearly written [[expository articles|Expository article]] on important and neglected topics within the canon of quantum theory, including (but not limited to) quantum mechanics, quantum information, quantum computation, and the mathematical foundations of quantum theory. Some examples can be found [[here|Good Paper Topics]]. The Journal also seeks clearly-written articles on [[exploratory research|Research project]] designed to explicate, demonstrate, and clarify the principles of quantum theory. While the Journal is not intended as a forum for cutting-edge original research, articles documenting such research are welcomed if they simultaneously address the Journal's core goal of enhancing its readership's understanding of quantum theory's core principles. \n\nWith this goal in mind, we ask that submissions be written at the technical level of an advanced undergraduate or beginning graduate math/physics student. The Journal accepts both traditional papers (in PDF format), and online Wikipedia articles. PDF articles should be typeset and formatted in line with the guidelines for [[Physical Review Letters|http://prl.aps.org/info/authors.html]], with a final length of 3-5 pages. We //strongly// recommend using [[LaTeX|http://www.latex-project.org/]], with the [[REVTeX|http://authors.aps.org/revtex4/]] document class, to prepare and typeset articles. LaTeX installations and tools are available for every commonly used operating system.\n\nA Wikipedia "article" can be a single page, or a cluster of closely related pages. 
Traditional articles must be the sole work of the author[s], but for Wikipedia articles we recognize that strict control over authorship is neither possible nor desirable. Nonetheless, the Editors require that the author[s] submitting a Wikipedia article be prepared to demonstrate that a solid majority of the article's content (at the time of submission) is their own -- in short, that the author[s] "deserve credit" for the article. Authors are encouraged to maintain their article after its acceptance into the Journal, but are not responsible for doing so. Finally, the Journal requests that all author[s] on submissions with multiple authors provide a brief statement detailing what //they//, individually, contributed to the work.\n\nPlease submit articles via email to <prof@am473.ca>. Traditional articles must be submitted as PDF attachments. Wikipedia articles should be submitted by providing a ''permalink'' to a particular revision of the page, or separate permalinks for each page in a cluster. E.g., <http://en.wikipedia.org/w/index.php?title=Quantum_theory&oldid=273181654>, rather than <http://en.wikipedia.org/wiki/Quantum_theory>. The submission email may, at the author's option, include a brief note explaining the submission -- but submissions will generally be assumed to speak for themselves.
Bayes' Rule, or more properly //Bayes' Theorem//, connects the conditional probability of an event $E$ (i.e., its probability given that some other event $F$ //has// happened) with the other event's conditional probability (i.e., the probability that $F$ would have happened if $E$ had happened). It's usually used to //update// a prior probability distribution (i.e., the state of a system //before// a measurement) into a posterior probability distribution (its state //after// the measurement).\n\nBayes' Rule is:\n\sbegin{equation}\np(j | I_k) = p(j) \scdot p(I_k | j ) \scdot \sfrac{1}{\ssum_{j'}{p(j')p(I_k|j')}}\n\send{equation}\n\nTo derive it, we start from the definition of conditional probability. The probability of event $A$ given event $B$ is\n\sbegin{equation}\nP(A|B)=\sfrac{P(A \scap B)}{P(B)}.\n\send{equation}\nEquivalently, the probability of event $B$ given event $A$ is\n\sbegin{equation}\nP(B|A) = \sfrac{P(A \scap B)}{P(A)}. \s!\n\send{equation}\nRearranging and combining these two equations, we find\n\sbegin{equation}\nP(A|B)\s, P(B) = P(A \scap B) = P(B|A)\s, P(A). \s!\n\send{equation}\nThis lemma is sometimes called the product rule for probabilities. Dividing both sides by $P(B)$, providing that it is non-zero, we obtain Bayes' Rule:\n\sbegin{equation}\nP(A|B) = \sfrac{P(A \scap B)}{P(B)} = \sfrac{P(B|A)\s,P(A)}{P(B)}. \s! \n\send{equation}\n\nBayes' Rule itself is noncontroversial, but the settings in which it should be used are a matter of much debate between the two schools of probability theory: [[Frequentist|frequentist]] and [[Bayesian]]. 
Generally, Bayesians hold that it is always appropriate to describe a situation of uncertainty with a probability distribution -- because, to them, probability distributions are precisely betting odds, and therefore can be assigned subjectively -- whereas frequentists only use probability when there exists, at least in principle, an infinite ensemble of systems to define frequencies of observation in the limit of infinitely many measurements. As a result, Bayesians like to do statistical estimation by starting with a prior probability distribution over the unknown parameter and updating it using Bayes' Rule -- while frequentists abhor this.
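As a small illustration of the update rule $p(j | I_k) \propto p(j)\,p(I_k | j)$, here is a minimal Python sketch for a discrete set of hypotheses. The function name {{{bayes_update}}} and the two-coin numbers are made up for this example; they do not come from the course material.

```python
def bayes_update(prior, likelihood):
    """Return the posterior p(j | I_k) from a prior p(j) and likelihoods p(I_k | j).

    Both arguments are lists indexed by the hypothesis j.
    """
    # Numerator of Bayes' Rule: p(j) * p(I_k | j) for each hypothesis j.
    joint = [p * l for p, l in zip(prior, likelihood)]
    # Denominator: the total probability of the observation I_k.
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return [x / evidence for x in joint]


# Hypothetical example: hypothesis 0 is a coin biased 90% towards heads,
# hypothesis 1 is a fair coin. We start with no preference and observe heads.
prior = [0.5, 0.5]
likelihood = [0.9, 0.5]  # p(heads | j) for j = 0, 1
posterior = bayes_update(prior, likelihood)
print(posterior)
```

Observing heads shifts weight towards the biased coin, and the posterior still sums to 1 because of the normalizing denominator. To process a second observation, the posterior would simply be fed back in as the new prior.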
//Context:// [[Lecture 10]]\n[>img(50%,auto)[images/BlochSphere.png]]\n\nThe Bloch sphere is a solid sphere in $\sreals^3$, whose points are in 1:1 correspondence with the possible states for a quantum system with a 2-dimensional Hilbert space $\smathcal{H}_2 = \scomplex^2$ -- e.g., a photon's polarization, or the spin of a spin-$\sfr12$ particle, or a qubit. The points in the sphere are //Bloch vectors// in $\sreals^3$, and they are linear representations of density matrices. The components $x,y,z$ of the Bloch vector are the expectation values of the [[spin operators|Pauli matrices]] $\ssigma_x,\ssigma_y,\ssigma_z$.\n\n!!! Deriving the Bloch sphere\n\nConsider an arbitrary pure state $\sket\spsi$ for a quantum system in the 2-dimensional Hilbert space $\smathcal{H}_2$. If this state is represented in some basis, as\n$$\sket\spsi = \smat{ \salpha~ \s\s \sbeta~ },$$\nthen it appears to have 2 complex parameters, and therefore 4 real parameters. However, normalization fixes one ($|\salpha|^2+|\sbeta|^2=1$), and we must mod out the global phase -- i.e., identify $e^{i\stheta}\sket\spsi$ and $\sket\spsi$ as the same state. The global phase can be fixed by demanding that $\salpha$ be real, which eliminates another parameter. Thus, there are only 2 real parameters defining this quantum state.\n\nLet us represent this state $\sket\spsi$ by the three expectation values $\s{\sexpect{\ssigma_x},\sexpect{\ssigma_y},\sexpect{\ssigma_z}\s}$. There are three of them, so we might expect them to overdetermine the state -- and indeed they do, but in a very nice way. Let us treat them as a vector in 3-dimensional Euclidean space:\n$$\svec{r} = \smat{ X \s\s Y \s\s Z } = \smat{ \sexpect{\ssigma_x} \s\s \sexpect{\ssigma_y} \s\s \sexpect{\ssigma_z} }$$\nConveniently, this vector is always a //unit// vector -- it lies on the surface of a sphere of radius 1. 
To see this, we can calculate\n\sbegin{eqnarray}\n\sexpect{\ssigma_x} &=& \sbeta^*\salpha + \salpha^*\sbeta = 2\sRe(\sbeta^*\salpha) \srightarrow \sexpect{\ssigma_x}^2 = (\sbeta^*\salpha)^2 + (\salpha^*\sbeta)^2 + 2|\salpha|^2|\sbeta|^2 \s\s\n\sexpect{\ssigma_y} &=& i\sbeta^*\salpha - i\salpha^*\sbeta = 2\sIm(\salpha^*\sbeta) \srightarrow \sexpect{\ssigma_y}^2 = -(\sbeta^*\salpha)^2 - (\salpha^*\sbeta)^2 + 2|\salpha|^2|\sbeta|^2 \s\s\n\sexpect{\ssigma_z} &=& |\salpha|^2 - |\sbeta|^2 \srightarrow \sexpect{\ssigma_z}^2 = |\salpha|^4 - 2|\salpha|^2|\sbeta|^2 + |\sbeta|^4\n\send{eqnarray}\nand then, adding up their squares, we find\n$$\sexpect{\ssigma_x}^2 + \sexpect{\ssigma_y}^2 + \sexpect{\ssigma_z}^2 = |\salpha|^4 + 2|\salpha|^2|\sbeta|^2 + |\sbeta|^4 = \sleft(|\salpha|^2 + |\sbeta|^2\sright)^2 = (1)^2 = 1.$$\n\nAs we'll prove in a moment, not only does every $\sket\spsi$ yield a point on the sphere, but every point on the sphere (i.e., every normalized $\svec{r}$ vector) corresponds to a valid pure state! A bit of algebra shows that the six states $\s{\sket\suparrow,\sket\sdownarrow,\sket\srightarrow,\sket\sleftarrow,\sket\sinarrow,\sket\soutarrow\s}$ correspond to points at the six "poles" -- North, South, East, West, In, and Out -- of the sphere. This is very nice -- it retroactively justifies the notation we've used for them.\n\nBut there's more to it than that: each of those is an eigenstate of $J_z$, $J_x$, or $J_y$ respectively. The eigenstates of $J_z$ are represented by vectors pointing in the $\spm\shat{z}$ direction, and the same goes for the eigenstates of the other angular momentum operators. In fact (this is cool!) for any unit vector $\shat{n}$, the eigenstates of\n$$\svec{J}\scdot\shat{n} \sequiv n_xJ_x + n_yJ_y + n_zJ_z$$\nlie along the $\spm\shat{n}$ axes! Proving this requires only a bit of (slightly tedious) algebra.\n\n!!! 
Spherical parametrization of pure states\n\nNow that we know about the Bloch sphere, here is a nice way to parametrize a pure state: \n$$\sket\spsi = \smat{ \scos(\stheta/2)~ \s\s e^{i\sphi}\ssin(\stheta/2)~ }\s,$$\nWhy is this nice? Well, first of all, it explicitly parametrizes //all// pure states using just two real parameters (as I suggested above should be possible). What's really nice, though, is what happens when we write out the projector onto this state,\n\sbegin{eqnarray}\n\sproj{\spsi} &=& \smat{ \scos^2(\stheta/2)~ & e^{-i\sphi}\ssin(\stheta/2)\scos(\stheta/2)~ \s\s e^{i\sphi}\ssin(\stheta/2)\scos(\stheta/2)~ & \ssin^2(\stheta/2)~ } \s\s\n&=& \smat{ \scos^2(\stheta/2)~ & \sfrac12e^{-i\sphi}\ssin\stheta~ \s\s \sfrac12e^{i\sphi}\ssin\stheta~ & \ssin^2(\stheta/2)~ },\n\send{eqnarray}\nand calculate the $\svec{r}$ vector of expectation values:\n\sbegin{eqnarray}\nZ &=& \sexpect{\ssigma_z} = \scos^2(\stheta/2) - \ssin^2(\stheta/2) = \scos\stheta, \s\s\nX &=& \sexpect{\ssigma_x} = \sfrac12\ssin\stheta\sleft(e^{-i\sphi} + e^{i\sphi}\sright) = \ssin\stheta\scos\sphi, \s\s\nY &=& \sexpect{\ssigma_y} = \sfrac{-i}{2}\ssin\stheta\sleft(e^{i\sphi} - e^{-i\sphi}\sright) = \ssin\stheta\ssin\sphi.\n\send{eqnarray}\nThese are just the spherical coordinates that parametrize an arbitrary point on the unit sphere. So for any point on the unit sphere, we can write down $\stheta$ and $\sphi$, and then construct a quantum pure state corresponding to that point! Thus, points on the surface of the Bloch sphere are in 1:1 correspondence with pure states.\n\n!!! The geometry of the Bloch sphere\n\nNow, a few words on the geometry of the Bloch sphere. First, you should //definitely// make note of the fact that orthogonality in $\sreals^3$ is not the same as orthogonality in Hilbert space! For instance, the states $\sket\suparrow$ and $\sket\sdownarrow$ are orthogonal, but they lie at opposite poles of the Bloch sphere -- their Bloch vectors are //antiparallel//. 
So, in fact, are the Bloch vectors corresponding to the pairs $\sket\srightarrow,\sket\sleftarrow$ and $\sket\sinarrow,\sket\soutarrow$. This is a general fact: pairs of orthogonal states are represented by //antipodal// points on the Bloch sphere (or anti-parallel Bloch vectors, if you prefer). States whose Bloch vectors are separated by 90 degrees -- e.g., $\sket\suparrow,\sket\srightarrow$ -- are //not// orthogonal in Hilbert space, even though their Bloch vectors are orthogonal in $\sreals^3$.\n\n!!! Digression on $SO(3)$ and $SU(2)$\n\nYou might be tempted to wonder //why// there is this very nice correspondence between angular momentum along a particular direction, and quantum states in Hilbert space. In particular, does it always hold? -- e.g., for systems with angular momentum greater than $\sfr{\shbar}{2}$? The answer is: no and yes. The Bloch sphere is a unique feature of 2-dimensional Hilbert spaces, and systems with greater total angular momentum have larger Hilbert spaces. Thus, their states do not form a nice geometric shape like this. //However//, there is something very deep about the Bloch sphere, and it comes from the group theory behind the physics.\n\nThe symmetry group -- i.e., the group of reversible transformations -- of a classical spinning particle is the group $SO(3)$, consisting of all $3\stimes3$ "//orthogonal//" rotation matrices. This is a fairly intuitive group -- it's just all the ways you could rotate a rigid body in $\sreals^3$! Anyway, the group of reversible transformations on a 2-dimensional quantum system is $SU(2)$, consisting of all $2\stimes 2$ unitary matrices (well, technically just the ones with determinant 1 -- try Wikipedia for the distinction between $SU(2)$ and $U(2)$). Why do I mention this? Because it turns out that $SU(2)$ is darn near isomorphic to $SO(3)$! In fact, it's what's called a //double cover// -- locally, they look indistinguishable, but you go 'round twice in $SU(2)$ to get back to your starting point. 
Sort of like a Moebius strip looks very much like a ring of paper, except for its global properties (the twist). Anyway, this is what makes a 2-dimensional quantum system look so much like a classical spinning particle. Quantum angular momentum systems with larger $J$ are also governed by $SU(2)$, but they form different //representations// of the abstract group $SU(2)$... see [[Lecture 25]] and [[Lecture 26]].\n\nI also mentioned before that photons are 2-state quantum systems. They can also be described profitably with the Bloch sphere, but the correspondence to physical space is a little warped. Now the $\sket{H}$ state (horizontal polarization along the $\shat{x}$ axis) is at the top of the sphere, the $\sket{V}$ state (vertical polarization along the $\shat{y}$ axis) is at the bottom, and the other states are $\sket{D}$ and $\sket{A}$ (diagonal and antidiagonal polarization) and $\sket{L}$ and $\sket{R}$ (left and right circular polarization). The Bloch sphere works fine, but as you can see there's no real physical intuition to be gained by looking at the directions in which various Bloch vectors point.\n
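\n//Worked example.// As a quick check of the spherical parametrization above, take $\stheta=\spi/2$ and $\sphi=0$ (values picked only for illustration). The state and its Bloch vector are\n$$\sket\spsi = \smat{ \scos(\spi/4)~ \s\s \ssin(\spi/4)~ } = \sfrac{1}{\ssqrt2}\smat{ 1~ \s\s 1~ }, \sqquad (X,Y,Z) = (\ssin\stheta\scos\sphi, \ssin\stheta\ssin\sphi, \scos\stheta) = (1,0,0),$$\nso this state sits on the $+\shat{x}$ axis, as an eigenstate of $J_x$ should. The antipodal point, $\stheta'=\spi-\stheta=\spi/2$ and $\sphi'=\sphi+\spi=\spi$, gives\n$$\sket{\spsi'} = \smat{ \scos(\spi/4)~ \s\s e^{i\spi}\ssin(\spi/4)~ } = \sfrac{1}{\ssqrt2}\smat{ 1~ \s\s -1~ }, \sqquad \sbraket{\spsi'}{\spsi} = \sfrac12(1-1) = 0,$$\nwhich illustrates the general rule that antipodal Bloch vectors correspond to orthogonal states.\n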
//Context:// [[Lecture 4]] and [[Lecture 5]]\n\nBorn's Rule is an axiom of quantum theory: it states that if a quantum system is prepared in the state $\sket{\spsi}$, and a measurement with an outcome $j$ corresponding to $\sbra{\sphi_j}$ is made, then the probability of observing outcome $j$ is\n$$p(j|\spsi) = |\sbraket{\sphi_j}{\spsi}|^2.$$\nThis can be usefully rewritten using the [[cyclic property of the trace]] as\n$$p(j|\spsi) = \sbraket{\sphi_j}{\spsi}\s!\sbraket{\spsi}{\sphi_j} = \sTr\sleft[\sproj{\sphi_j}\sproj{\spsi}\sright]$$\nwhich is bilinear in $\sproj{\sphi_j}$ and $\sproj{\spsi}$. If we take $\sproj{\sphi_j}$ to represent the measurement outcome, and $\sproj{\spsi}$ to represent the state, then we can generalize the theory to include (i) outcomes $\shat{E}_j$ of [[POVM]] measurements and (ii) mixed quantum states $\shat{\srho}$, both represented as operators:\n\sbegin{eqnarray}\n\sket{\spsi} &\slongrightarrow& \sproj{\spsi} \slongrightarrow \shat{\srho} \s\s\n\sbra{\sphi_j} &\slongrightarrow& \sproj{\sphi_j} \slongrightarrow \shat{E}_j.\n\send{eqnarray}\nIn this more powerful formalism, Born's Rule generalizes to\n$$p(j|\srho) = \sTr\sleft[ \shat{E}_j \shat{\srho} \sright]$$\n\n!!! Some important conceptual and historical notes\n\n# In this theory, states and measurement outcomes are associated with different mathematical objects. States are column vectors ($\svec\sphi$), measurement outcomes are row vectors ($\svec\spsi_j^\sdagger$). It is a rather deep fact (theorem, actually), that we can flip back and forth among them -- i.e., that they are //isomorphic// -- via the $\sdagger$ operation. To put it in rigorous mathematical language, once we have picked our Hilbert space $\smathcal{H}$, the states are vectors in $\smathcal{H}$ and the measurement outcomes are //dual vectors//, elements of the [[dual space]] $\smathcal{H}^*$. Only by combining one of each (think Yin and Yang here) can we obtain a scalar amplitude $z_j$. 
This is quite similar to the structure of classical probability theory, wherein we got probabilities by combining a state and an indicator function.\n\n# You might be wondering why the probability is the //square// of the amplitude $z_j$. After all, while it has to be real, we could just as easily have declared that it was $|z_j|$, right? or even $|z_j|^4$ or something. The answer is ''we don't actually know''! When Max Born wrote down his eponymous rule in a paper in 1926, he actually got it wrong. He thought it was $p_j = |\svec\spsi_j^\sdagger\svec\sphi|$. The correct formula appears in a last-minute footnote, where he basically says "Actually, as this was going to proofs I realized it was the inner product //squared//!" You probably shouldn't spend too much time worrying about this, because: (1) experimental evidence, which is always the last word in physics, absolutely confirms Born's [modified] Rule; and (2) there are some pretty darn good mathematical reasons why it ought to be $|z_j|^2$ instead of $|z_j|^p$ for some other $p$. But you should be aware that people have been arguing about how to prove or derive Born's Rule for the past 50 years... see, for instance, http://arxiv.org/abs/quant-ph/0405161, which is still stirring up debate.\n
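\n//Worked example.// To see that the bracket form and the trace form of Born's Rule agree, consider a 2-dimensional system with (as an illustrative assumption) the state and outcome\n$$\sket\spsi = \smat{ \salpha~ \s\s \sbeta~ }, \sqquad \sbra{\sphi_j} = \smat{ 1~ & 0~ }, \sqquad |\salpha|^2+|\sbeta|^2=1.$$\nThe bracket form gives $p(j|\spsi) = |\sbraket{\sphi_j}{\spsi}|^2 = |\salpha|^2$. For the trace form, multiply the two projectors:\n$$\sproj{\sphi_j}\sproj{\spsi} = \smat{ 1~ & 0~ \s\s 0~ & 0~ }\smat{ |\salpha|^2~ & \salpha\sbeta^*~ \s\s \salpha^*\sbeta~ & |\sbeta|^2~ } = \smat{ |\salpha|^2~ & \salpha\sbeta^*~ \s\s 0~ & 0~ },$$\nwhose trace is again $|\salpha|^2$. The same recipe handles a mixed state: for instance, if $\shat{\srho}$ is one half of the $2\stimes2$ identity matrix, then $\sTr\sleft[\sproj{\sphi_j}\shat{\srho}\sright] = \sfrac12$.\n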
Creating bulleted lists is simple.\n* Just add an asterisk\n* at the beginning of a line.\n** If you want to create sub-bullets\n** start the line with two asterisks\n*** And if you want yet another level\n*** use three asterisks\n* You can also do [[Numbered Lists]]\n{{{\nCreating bulleted lists is simple.\n* Just add an asterisk\n* at the beginning of a line.\n** If you want to create sub-bullets\n** start the line with two asterisks\n*** And if you want yet another level\n*** use three asterisks\n* You can also do [[Numbered Lists]]\n}}}
<html><div align="center">\n<iframe src="http://www.google.com/calendar/embed?src=aeqa6f611unem7gg91qs33tqgg%40group.calendar.google.com&ctz=America/New_York" style="border: 0" width="800" height="600" frameborder="0" scrolling="no"></iframe>\n</div></html>
Roughly speaking, a sequence is Cauchy if and only if it is convergent. To put it another way, "Cauchyness" is the most commonly used definition of convergence for sequences.\n\nA sequence of points $\s{x_1,x_2,x_3,\sldots\s}$ in a metric space with metric $d(x,y)$ is Cauchy if and only if for every real $\sepsilon>0$, there exists some $N$ such that for every $m,n>N$,\n$$d(x_m,x_n) < \sepsilon.$$\nThis means that for every tiny $\sepsilon$, there is some point ($N$) beyond which all the elements of the sequence lie within a ball of radius $\sepsilon$.
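\n\n//Worked example.// Take the real numbers with the usual metric $d(x,y)=|x-y|$ and the sequence $x_n = 1/n$ (chosen here just as an illustration). For any $m,n>N$, both $1/m$ and $1/n$ lie between $0$ and $1/N$, so\n$$d(x_m,x_n) = \sleft|\sfrac{1}{m}-\sfrac{1}{n}\sright| < \sfrac{1}{N}.$$\nGiven $\sepsilon>0$, any integer $N \sgeq 1/\sepsilon$ therefore guarantees $d(x_m,x_n)<\sepsilon$ for all $m,n>N$, and the sequence is Cauchy.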
Chris is the TA for AM473/PHYS454 in the Fall '08 term.\n\nChris's office is MC 6091H, where he has office hours from 12:30-1:20 PM on Thursdays. He can be reached by email at [[ta@am473.ca|mailto:ta@am473.ca]].\n
/***\nAuthors: Eric Shulman & Bradley Meck\nversion: 2007.30.03\nsource: http://www.tiddlytools.com/\n***/\n/*{{{*/\nconfig.commands.collapseNote = {\ntext: "-",\ntooltip: "Collapse this note",\nhandler: function(event,src,title)\n{\nvar e = story.findContainingNote(src);\nif(e.getAttribute("template") != config.noteTemplates[DEFAULT_EDIT_TEMPLATE]){\nvar t = (readOnly&&store.noteExists("WebCollapsedTemplate"))?"WebCollapsedTemplate":"CollapsedTemplate";\nif(e.getAttribute("template") != t ){\ne.setAttribute("oldTemplate",e.getAttribute("template"));\nstory.displayNote(null,title,t);\n}\n}\n}\n}\n\nconfig.commands.expandNote = {\ntext: " | ",\ntooltip: "Expand this note",\nhandler: function(event,src,title)\n{\nvar e = story.findContainingNote(src);\nstory.displayNote(null,title,e.getAttribute("oldTemplate"));\n}\n}\n\nconfig.macros.collapseAll = {\nhandler: function(place,macroName,params,wikifier,paramString,note){\ncreateTiddlyButton(place,"-","Collapse all notes",function(){\nstory.forEachNote(function(title,note){\nif(note.getAttribute("template") != config.noteTemplates[DEFAULT_EDIT_TEMPLATE])\nvar t = (readOnly&&store.noteExists("WebCollapsedTemplate"))?"WebCollapsedTemplate":"CollapsedTemplate";\nstory.displayNote(null,title,t);\n})})\n}\n}\n\nconfig.macros.expandAll = {\nhandler: function(place,macroName,params,wikifier,paramString,note){\ncreateTiddlyButton(place,"expand all","Expand all notes",function(){\nstory.forEachNote(function(title,note){\nvar t = (readOnly&&store.noteExists("WebCollapsedTemplate"))?"WebCollapsedTemplate":"CollapsedTemplate";\nif(note.getAttribute("template") == t) story.displayNote(null,title,note.getAttribute("oldTemplate"));\n})})\n}\n}\n\nconfig.commands.collapseOthers = {\ntext: "ร",\ntooltip: "Expand this note and collapse all others",\nhandler: function(event,src,title)\n{\nvar e = story.findContainingNote(src);\nstory.forEachNote(function(title,note){\nif(note.getAttribute("template") != 
config.noteTemplates[DEFAULT_EDIT_TEMPLATE]){\nvar t = (readOnly&&store.noteExists("WebCollapsedTemplate"))?"WebCollapsedTemplate":"CollapsedTemplate";\nif (e==note) t=e.getAttribute("oldTemplate");\n//////////\n// ELS 2006.02.22 - removed this line. if t==null, then the *current* view template, not the default "ViewTemplate", will be used.\n// if (!t||!t.length) t=!readOnly?"ViewTemplate":"WebViewTemplate";\n//////////\nstory.displayNote(null,title,t);\n}\n})\n}\n}\n/*}}}*/
<div>\n<span class='toolbar' macro='toolbar +editNote expandNote collapseOthers closeOthers -closeNote'></span>\n<span class='title' macro='view title'></span>\n</div>
I developed much of the content on this site while teaching [[AM473]] in the fall of 2008. As I posted course content, I updated the list below. If you took the class, you might find this list useful (or at least... evocative).\n\n!!! December updates\n* 12/14/08: [[Lecture 32]] posted.\n* 12/13/08: [[Lecture 31]] posted.\n* 12/11/08: [[Lecture 28]] and [[Lecture 30]] posted. Typos and mistakes are a possibility!\n* 12/09/08: [[Lecture 27]] is posted.\n* 12/08/08: Homework #6 Solutions are posted on the [[Homework Solutions]] page. Unfortunately, a rather critical factor-of-2 mistake in Problem 4 trivialized it as originally written, so (a) if you did the homework, Problem 4 is essentially a free pass, and (b) I've given the solution to the (//far more educational//) problem as it __should__ have been stated.\n* 12/01/08: Typo fixed in [[Homework #6]], Problem 1, section (d), part (iii) -- $\shat{\sPi}_{symm}$ should have been $\shat{\sPi}_{anti}$.\n!!! November updates\n* 11/23/08: [[Homework #6]] ([[pdf|pdf/HW6.pdf]]) is posted. [[Homework #5 Solutions]] are posted on the [[Homework Solutions]] page.\n* 11/17/08: [[Homework #4b Solutions]] are posted on the [[Homework Solutions]] page. Sorry for the delay -- I just didn't notice I hadn't posted them!\n* 11/16/08: The ''Journal of Advanced Quantum Theory'' is proud to announce that article submissions are being considered. 
Guidelines for referees are available at\n*** [[Referee report form for Wikipedia articles]]\n*** [[Referee report form for expository articles]].\n*** [[Referee report form for research articles]].\n* 11/16/08: [[Lecture 29 Slides|pdf/Lecture 29 Slides.pdf]] are posted.\n* 11/15/08: The [[1st Annual Waterloo Advanced Quantum Theory Workshop]] has been announced!\n* 11/14/08: Fixed a silly error in [[Homework #5]] Problem 1(b) (spurious factor of $(\spi\shbar)^{-1}$).\n* 11/12/08: Added a note on [[How to cite arxiv.org articles online]]\n* 11/12/08: [[Supplement A: The finite difference method]] has been updated, with a minor sign error fixed, and links to Maple code and a printout.\n* 11/11/08: Updated [[Expository article]] page with a link to instructions for formulas in Wikipedia.\n* 11/11/08: Bug fixes on [[Homework #5]]: numbering of the parts on Problems 1 & 4 is fixed; in 1(e), $\sfrac{1~}{2\spi\shbar~}$ should have been (and now is) $2\spi\shbar$; in 4(d), clarified that the question is whether this transformation can be done "up to a global phase".\n* 11/11/08: The [[Project]] page has been updated with guidelines and expectations for all types of projects.\n* 11/10/08: The [[Project]] page has been updated with guidelines and expectations for the [[Expository article]]. Guidelines for the other projects will be up soon -- for now, those doing a research article should read the [[Expository article]] page, as much of it is applicable to the research project track as well.\n* 11/09/08: [[Lecture 24]] and [[Lecture 26]] are online.\n* 11/09/08: A supplementary note on [[the finite difference method|Supplement A: The finite difference method]] is posted... relevant to anybody seeking to do numerical wavefunction calculations.\n* 11/08/08: [[Lecture 23]] is online, with video. ''Also'', I made a sign error in defining the "acceleration" operator $\shat{A}_{x,p}$ in [[Lecture 21]], and didn't catch it until now. 
All relevant lectures have been fixed, and there is an explanation in [[Lecture 21]]. The sign error propagated to [[Homework #4b]], problem 1... and to the solutions as well. I'm leaving those alone for now (because it was a graded assignment), so be aware that the sign is incorrect there.\n* 11/06/08: [[Lecture 22]] is online... lots of extra explanation (& cool videos) beyond what I presented in class... but not mandatory reading! :)\n* 11/05/08: [[Lecture 25]] is online.\n* 11/04/08: [[Homework #5]] posted ([[pdf version|pdf/HW5.pdf]]).\n* 11/04/08: [[Homework #4b hints]] posted.\n* 11/03/08: [[Lecture 20]] and [[Lecture 21]] are online.\n* 11/02/08: [[Lecture 18]] and [[Lecture 19]] are online. Also, some typos in [[Lecture 16]] have been fixed.\n!!! October updates\n* 10/28/08: Fixed a sign error in problem 1 of [[Homework #4b]] __and__ [[Lecture 21 notes]]. Clarified Problem 2 (you gotta find the wavefunction for $\sket{q}$). Also, [[Lecture 21 notes]] enhanced with a bit of extra material on the $\shat{S}$ and $\shat{A}$ operators, which you will probably find useful on the homework.\n* 10/27/08: [[Homework #4b]] updated with extra detail, esp. in Problem 3a.\n* 10/27/08: [[Homework #4b]] is posted. Typos or small mistakes on my part are a theoretical possibility for the next 24 hours.\n* 10/25/08: [[Schedule]] tweaked again.\n* 10/25/08: Added missing 4(e) solution to [[midterm solutions|pdf/MidtermSolutions.pdf]].\n* 10/24/08: [[Midterm exam|pdf/Midterm.pdf]] and [[midterm solutions|pdf/MidtermSolutions.pdf]] are online. The solutions have not been fully proofread; if there are mistakes, I'll catch them while grading.\n* 10/24/08: [[Lecture 18 notes]] are online, after some delay.\n* 10/23/08: [[Lecture 3]] is online in final form (but I haven't proofread it, so caveat emptor!). All lectures from [[Lecture 3]] to [[Lecture 17]] are finished.\n* 10/22/08: [[FAQ]] updated with some questions that are relevant to the midterm. 
[[Homework #4a Solutions]] are posted. [[Midterm Study Guide]] is posted.\n* 10/22/08: [[Lecture 16]], [[Lecture 17]] (with bonus Trogdor!), and [[Lecture 6]] are online in final form.\n* 10/21/08: [[Lecture 15]] is online in final form (it's pretty long). Also, [[Lecture 19 notes]] are up (used to be Lecture 18 notes, but we didn't get to the harmonic oscillator). [[Lecture 16]] and [[Lecture 17]] are partially finished.\n* 10/20/08: [[Homework #4a hints]] posted. Also fixed a couple of typos in Problem 4b of [[Homework #4a]] (not relevant to the problem).\n* 10/19/08: ''Final exam scheduled: December 15th, 12:30-3:00 PM, RCH 211''.\n* 10/18/08: Solutions to [[Homework #3]] are up on the [[Homework Solutions]] page.\n* 10/16/08: Added point values to problems in [[Homework #4a]].\n* 10/15/08: Bug fix in [[Homework #4a]] -- in Problem 3, I had neglected to define the multidimensional Poisson bracket! Also added more explanation in Problem 4.\n* 10/15/08: [[Homework #4a]] is online. It has not been proofread; minor changes may occur until midnight of October 15!\n* 10/14/08: [[Lecture 16 notes]] are online.\n* 10/14/08: Fixed an error in [[Lecture 8]] and [[Lecture 8 notes]], where I'd been swapping $\sket\sleftarrow$ and $\sket\srightarrow$ throughout.\n* 10/13/08: Significant typo ($\stheta$ vs. $\stheta/2$) fixed in [[Lecture 4]], section entitled "Stern-Gerlach results are isomorphic to light polarization experiments".\n* 10/11/08: [[FAQ]] updated again...\n* 10/10/08: [[Lecture 14 notes]], [[Lecture 15 notes]], and preliminary [[Lecture 16 notes]] are online.\n* 10/9/08: [[Lecture 12]] and [[Lecture 13]] online in final form.\n* 10/8/08: ''Midterm scheduled: 6:45-8:15 PM, October 24, PHYS 235''\n* 10/8/08: New [[FAQ]]. [[Lecture 11]] is online in final form. 
[[Schedule]] updated (to reflect the fact that we're a week behind the original, ambitious, schedule).\n* 10/7/08: Updated [[Homework #3]] with the point values of each problem (no changes to the problems).\n* 10/5/08: [[Lecture 8]], [[Lecture 9]], and [[Lecture 10]] are all online in final form.\n* 10/5/08: [[TeX|pdf/Homework3.tex]] and [[pdf|pdf/Homework3.pdf]] versions of [[Homework #3]] are available. ''The authoritative version is the one on the website!'' These are for-your-convenience ''only'', and may not get updated with bug fixes, etc!!! USE AT YOUR OWN RISK.\n* 10/5/08: Fixed a small but significant problem in the last line of [[Homework #3]] -- the metric needed to be normalized (wasn't actually a metric!)\n* 10/4/08: Fixed the link to "Cauchy" in the last problem of [[Homework #3]].\n* 10/4/08: [[Schedule]] is updated slightly.\n* 10/4/08: Some new [[FAQ]]s.\n* 10/4/08: [[Homework #3]] is posted.\n* 10/3/08: Solutions to [[Homework #2]] are up on the [[Homework Solutions]] page.\n!!! September updates\n* 9/30/08: Several new [[FAQ]]s, including how to get lots of karma by fixing website problems.\n* 9/30/08: Posted a LaTeX version of Homework #2, for reference purposes (see [[FAQ]]).\n* 9/29/08: A list of (brainstormed) [[Good Paper Topics]] is up.\n* 9/28/08: There is now an option to do //one// project, rather than two. See [[Syllabus Addendum]]; I will have more details on the project requirements up soon.\n* 9/28/08: The [[Schedule]] has due dates for projects.\n* 9/28/08: There is a Google [[Calendar]].\n* 9/28/08: Lots and lots of new little notes, in a Wiki style, defining various terms. Over time, I hope to convert many words in lecture notes into hyperlinks.\n* 9/28/08: I've started a [[FAQ]] -- "frequently asked questions", for those of you not familiar with the term.\n* [[Slides for Lecture 1|pdf/Lecture 1 Slides.pdf]] are online\n* Rough drafts of [[Lecture 9 notes]] and [[Lecture 10 notes]] are online. 
They are very rough, and I will improve them substantially when I have time.\n* 9/25/08: [[Homework #2]]: Minor loophole plugged in 2f, and an important typo fix in 3f -- final result should be $P_{correct} = \sfrac12 + \sfrac14\ssum_n{|p(n)-q(n)|}~$.\n* 9/23/08: [[Lecture 8 notes]] are online.\n* 9/23/08: Reshuffled Lecture Notes to better agree with what we //actually// covered.\n* 9/23/08: Minor changes to part 4e, and a typo fix in 2b, for [[Homework #2]].\n* 9/22/08: [[Homework #2]] is posted. Also see [[Homework Policy]].\n* 9/20/08: [[Schedule]] is updated slightly.\n* 9/16/08: [[Lecture 4]] is online.\n* 9/15/08: [[Lecture 3]] is online.
This site is powered by [[TiddlyWiki|http://www.tiddlywiki.com]] <<version>>\n!Robin installed these plugins\n*[[ImageSizePlugin]]\n**used to specify a size, or percentage-of-note size, for images\n!I installed these plugins (need t(T)iddler -> n(N)ote find and replace):\n*[[InlineJavascriptPlugin]]\n**used for the [[DisplayControl]]\n**and for [[HideTags]] (used for slides)\n*[[TextAreaPlugin]]\n**deselect the edit contents, and adds ctr-f,ctrl-g,cmd-v search/replace to editing.\n*[[jsMathPlugin]]\n**this processes the [[LaTeX]]. The AJAX part had problems, so I put the jsmath load into the source directly.\n**inserted custom LaTeX/jsmath command abbreviations into plugin.\n*[[CollapsePlugin]]\n**[[CollapsedTemplate]]\n*[[RearrangeNotesPlugin]]\n*[[ListTaggedPlugin]]\n**used for folder/tag listings\n*[[AllTagsExceptPlugin]]\n**advanced checkbox to see system tags\n*[[CopyNotePlugin]]\n*[[DisableWikiLinksPlugin]]\n**remove checkbox so it's always on\n**this is very tricky in combination with [[jsMathPlugin]] and \sss pytw problem\n*[[FaviconPlugin]]\n*[[ReferencesPlugin]]\n*[[UploadPlugin]]\n**changed 3 txtUploadUserName -> txtUserName\n**changed 3 pasUploadPassword -> pasPassword and a line in Intitializations\n**commented out password checkbox\n**change index.html and directory chmod to 777\n**put B's logging script call in [[MarkupPostBody]]\n***make index executable so log script will run\n*[[RecentPlugin]]\n**set to show last 2\ncheck to make sure I didn't install any<<tag plugin>>and forget to list it here. Try using the [[PluginManager]].\n\n!I changed these notes to configure operation and appearance:\n*These control the content of several boxes:\n**[[SiteTitle]]\n**[[SiteSubtitle]]\n**[[SiteUrl]]\n**[[DefaultNotes]]\n**[[MainMenu]]\n**[[SideBarOptions]]\n**[[OptionsPanel]]\n***[[SideBarOptionsText]]\n**[[AdvancedOptions]]\n**[[SideBarTabs]]\n***[[TabContents]]\n***[[TabTimeline]]\n***[[TabAll]] - nope, for some reason this has the text built in. 
:(\n***[[TabTags]]\n**[[DisplayControl]]\n*These are css layout templates:\n**[[PageTemplate]]\n**[[ViewTemplate]]\n**[[EditTemplate]]\n**[[CollapsedTemplate]]\n*And these change the system and css options:\n**[[SystemConfig]]\n**[[StyleSheet]]\n***Trouble with [[MyColors]] conflicting with [[ColorPalette]]\n**[[StyleSheetPrint]]\nThe default config files are invisible and listed as [[ShadowNotes]]. These:\n*[[StyleSheetLayout]]\n*[[StyleSheetColors]]\nare augmented and overriden by the [[StyleSheet]]. If they change in the future, with updates, the old version content will likely have to be added to the new [[StyleSheet]]. \n\n!Evil raw html/javascript TW source code tweakage\n*edit cookie options, since setting them in [[SystemConfig]] overrides user cookies\n*maybe add ctrl-w accessKey -- just fooling around\n*comment out a couple of displayMessage s\n*switch line order in {{{config.macros.search.handler}}} for search button after search field\n*comment out tag prompt line in {{{config.macros.tags.handler}}}\n*Insert this just after body. (This starts jsmath)\n**{{{<scriipt src="jsMath/jsMath.js"></scriipt>}}}\n*add B's logging script call\n**make index executable so log script will run\n\n!edited {{{jsMath/easy/load.js}}}\n*changed default font scaling to {{{scale: 110}}} and {{{warn: 0}}}\n*remove doubleclick show\n*reduced vertical margins by adding {{{margin-top: 0.5em; margin-bottom: 0.5em;}}}\n*hide jsMath button\n\n!And finally\nI did a global find/replace of "t(T)iddler" -> "n(N)ote" in the base file or directory via editor or the noteify shell script. Make sure to do this when more plugins are installed, so they'll work.\n\nThen, save a bare copy, without folders or editing tips, and a minimal copy, with them. Then try to [[ImportNotes]].
/***\nAuthors: Eric Shulman\nversion: 2.1.2\nsource: http://www.tiddlytools.com/\nadds a "copy" option to duplicate a note\n***/\n/*{{{*/\nversion.extensions.copyNote= {major: 2, minor: 1, revision: 2, date: new Date(2007,5,17)};\nconfig.commands.copyNote = {\n text: '\sxA9',\n hideReadOnly: true,\n tooltip: 'Make a copy of this note',\n prefix: "Copy of ",\n handler: function(event,src,title) {\n var text=store.getNoteText(title); // get text from note (or shadow)\n var tags=[]; var tid=store.getNote(title); if (tid) tags=tid.getTags();\n var textfield=story.getNoteField(title,"text");\n if (textfield&&textfield.getAttribute("edit")=="text") var text=textfield.value; // edit mode, use field value\n var tagsfield=story.getNoteField(title,"tags");\n if (tagsfield&&tagsfield.getAttribute("edit")=="tags") var tags=tagsfield.value; // edit mode, use field value\n var newTitle = this.prefix + title;\n story.displayNote(null,newTitle,DEFAULT_EDIT_TEMPLATE);\n story.getNoteField(newTitle,"text").value=text;\n story.getNoteField(newTitle,"tags").value=tags;\n story.focusNote(newTitle,"title");\n return false;\n }\n};\n/*}}}*/
Welcome\nAM473
Dirac's notation serves two purposes. First, it is a particularly simple and powerful way of manipulating linear algebraic symbols. It adds no new mathematics, but lets us write equations involving vectors and matrices in a way that most physicists and other practitioners of quantum physics find particularly convenient. Second, Dirac's notation is well-adapted for the transition from finite-dimensional Hilbert spaces to infinite-dimensional Hilbert spaces, wherein we can't assume that state vectors can be represented by lists of numbers, and therefore the notation $\svec\spsi$ is potentially misleading.\n\nDirac notation involves exactly two new symbols:\n* A quantum state (e.g., a column vector $\svec{\spsi}$) is written as $\sket\spsi$. This is called a "ket".\n* A quantum measurement outcome (e.g., a row vector $\svec{\sphi}^\sdagger$) is written as $\sbra\sphi$. This is called a "bra".\n\nA couple of simple properties need to be stated here:\n* First, you should recall that when we map a state into a measurement outcome by transposing it, we //must// conjugate its complex numbers as well. This is the Hermitian transpose or //adjoint// operation, written $A \srightarrow A^\sdagger$, or $\svec\spsi \srightarrow \svec\spsi^\sdagger$ when applied to states (column vectors). //This is assumed in Dirac notation//. In other words, a bra is the adjoint of a ket: $(\sket\spsi)^\sdagger \sequiv \sbra\spsi$. So if you ever find yourself writing things like $\sbra{\spsi^\sdagger}$ or $\sbra{\spsi^*}$, you're probably dangerously confused!\n* Since kets and bras are just column/row vectors, they can be multiplied by complex scalars. So, notation like $\salpha\sket\spsi$ or $\sbeta\sbra\spsi$ is perfectly appropriate.\n* Remember, however, that in Dirac notation when we transform a ket into a bra, the implicit complex conjugation applies //only// to the ket itself, not to its coefficient! 
In other words\n$$(\salpha\sket\spsi)^\sdagger = \salpha^*\sbra\spsi,$$\nand\n$$(\sbeta\sbra\spsi)^\sdagger = \sbeta^*\sket\spsi.$$\n\n!!Some rules for using Dirac notation.\n\nThat's really all there is to Dirac notation, but most of us find it quite useful. The critical thing to remember is that bras and kets are just symbols for row and column matrices, so the rules for their combination and evaluation are //exactly// those of linear algebra. Here are the most important ones:\n1. The most basic and useful convention is to write an inner product between a vector and a dual vector as\n$$z = \svec\sphi^\sdagger\svec\spsi = \sbraket{\sphi}{\spsi}.$$\nThis is called a //braket//, pronounced "bracket". A bracket is an //inner product//, also known as the //scalar product// of two vectors (because it produces a scalar -- both in the math sense of "a single number" and in the physics sense of "a quantity that is invariant under change of basis or reference frame"). \n\n2. We also have an //outer product//:\n$$\svec\sphi\svec\spsi^\sdagger = \sketbra{\sphi}{\spsi}.$$\nThis notation is initially confusing to a lot of folks. If you remember linear algebra, then you should observe that $\sket\sphi$ is a $d\stimes1$ matrix and $\sbra\spsi$ is a $1\stimes d$ matrix, and therefore their product is a $d\stimes d$ matrix. Thus, $\sketbra{\sphi}{\spsi}$ is a matrix -- also known as an //operator//, because it acts or "operates" on vectors, transforming them into other vectors. In the abstract, an operator is defined as a linear map from vectors to vectors (and a dual vector is defined as a linear map from vectors to scalars). For most of this course (and always unless explicitly stated otherwise) all of our operators will be //square// -- i.e., $d\stimes d$ matrices, or (in infinite dimensions) maps from a Hilbert space to //itself// (rather than to some other Hilbert space).\n\n3. 
Matrix multiplication is //associative//, which means you can break up a string of matrices (including bras and kets) with parentheses wherever you like, e.g.:\n$$\sketbra{a}{b}\sketbra{c}{d} = (\sketbra{a}{b})(\sketbra{c}{d}) = \sket{a}(\sbraket{b}{c})\sbra{d}$$\nOne very important example of this illustrates why $\sketbra{\sphi}{\spsi}$ is an operator:\n$$\sleft(\sketbra{\sphi}{\spsi}\sright)\sket{\smu} = \sket\sphi\sleft(\sbraket{\spsi}{\smu}\sright) = \sleft(\sbraket{\spsi}{\smu}\sright)\sket\sphi$$\nWhy did we move $\sbraket{\spsi}{\smu}$ to the left of the ket $\sket\sphi$? Because ''a bracket is a complex number!'' In matrix multiplication, scalars commute with each other and with everything else. So we move the number to the left to emphasize the fact that the result of $\sleft(\sketbra{\sphi}{\spsi}\sright)\sket{\smu}$ is of the form $z\sket\sphi$ -- i.e., a scalar times a state vector. This brings us to an absolutely critical point:\n\n4. ''Matrix multiplication is not generally //commutative//''. Yes, I'm shouting. This is important. You ''cannot'' generally change the order of bras and kets, because they are matrices and matrices do not generally commute! The only general exception to this is that a scalar -- in particular, a bracket $\sbraket{\sphi}{\spsi}$ -- commutes with everything. Otherwise, you can't change the order. Consider, for instance\n\sbegin{eqnarray}\n\sleft(\sketbra{a}{b}\sright)\sleft(\sketbra{c}{d}\sright) = \sketbra{a}{b}\sketbra{c}{d} = \sbraket{b}{c}\sketbra{a}{d} && \s\s\n&\sneq& \s\s\n&& \sleft(\sketbra{c}{d}\sright)\sleft(\sketbra{a}{b}\sright) = \sketbra{c}{d}\sketbra{a}{b} = \sbraket{d}{a}\sketbra{c}{b}.\n\send{eqnarray}\n\n5. Matrix multiplication is //distributive// over addition. 
In other words:\n$$\sleft(\salpha\sbra{a}+\sbeta\sbra{b}\sright)\sket{c} = \salpha\sbraket{a}{c} + \sbeta\sbraket{b}{c}$$\nand\n$$\sbra{a}\sleft(\sbeta\sket{b}+\sgamma\sket{c}\sright) = \sbeta\sbraket{a}{b} + \sgamma\sbraket{a}{c}$$\nand even\n$$\sleft(\salpha\sbra{a}+\sbeta\sbra{b}\sright)\sleft(\sgamma\sket{c}+\sdelta\sket{d}\sright) = \salpha\sgamma\sbraket{a}{c} + \sbeta\sgamma\sbraket{b}{c} + \salpha\sdelta\sbraket{a}{d} + \sbeta\sdelta\sbraket{b}{d}$$\n\n6. If you "dagger" (i.e., take the adjoint of) a sequence of bras and kets, the result is a sequence where\n* Every bra turns into a ket, and every ket turns into a bra, and\n* The order of the terms is reversed.\nSo, for example:\n$$\sleft(\sbraket{\spsi}{\sphi}\sright)^\sdagger = \sbraket{\sphi}{\spsi}$$\n$$\sleft(\sketbra{\spsi}{\sphi}\sright)^\sdagger = \sketbra{\sphi}{\spsi}$$\n//Exercise: prove this.//\n\n7. The outer product of a state $\sket\spsi$ with //its own// dual vector $\sbra\spsi$ is special. We call $\sproj{\spsi}$ the //projector// onto $\sket{\spsi}$, because of how it acts as an operator. Suppose we have some random state $\sket{\sphi}$. Since it is a vector, it can be decomposed into the sum of a term //parallel// to $\sket\spsi$ and a term //orthogonal// to $\sket\spsi$, i.e.\n$$\sket\sphi = c_1\sket\spsi + c_2\sket{\soverline\spsi}$$\nwhere $\sbraket{\soverline\spsi}{\spsi}=0$. Now, what do we get if we "act on $\sket\sphi$" with $\sproj{\spsi}$ -- meaning that we multiply $\sket\sphi$ by $\sproj{\spsi}$ on the left:\n$$\sproj\spsi \sket\sphi = \sproj\spsi\sleft(c_1\sket\spsi + c_2\sket{\soverline\spsi}\sright) = c_1\sket\spsi$$\n[where we've used the fact that $\sbraket{\spsi}{\spsi}=1$ (states are normalized) and $\sbraket{\spsi}{\soverline{\spsi}}=0$ (orthogonality)]. 
We see that the operator $\sproj\spsi$ has stripped away the part of $\sket\sphi$ that is orthogonal to $\spsi$, but left the part that is parallel, thus //projecting// onto the axis defined by $\sket\spsi$.\n//Exercise: prove that $\sproj{\spsi}$ is Hermitian.//\n\n
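The bra-ket rules above (factoring out a bracket, non-commutativity, the dagger rule, and the projector) can be checked numerically. What follows is only a minimal sketch in plain Python: kets are lists of complex numbers, and the helper names ({{{braket}}}, {{{ketbra}}}, {{{act}}}, and so on) and the example vectors are made up for this illustration; they are not part of the course notes.

```python
def braket(a, b):
    """Inner product <a|b>: conjugate the entries of a, multiply, and sum."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def ketbra(a, b):
    """Outer product |a><b| as a square matrix (list of rows)."""
    return [[x * y.conjugate() for y in b] for x in a]

def matmul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(A, v):
    """Apply the operator A to the ket v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in A]

def dagger(A):
    """Adjoint: transpose and complex-conjugate."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def scale(z, A):
    """Multiply every entry of the matrix A by the scalar z."""
    return [[z * x for x in row] for row in A]

def same(A, B, tol=1e-12):
    """True if two matrices agree entry by entry (up to rounding)."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# Arbitrary (unnormalized) example vectors, chosen only for illustration.
a, b, c, d = [1, 1j], [2, -1], [1j, 3], [1, 1]

# Rule 4: |a><b|c><d| = <b|c> |a><d|, and the order of the factors matters.
ab_cd = matmul(ketbra(a, b), ketbra(c, d))
cd_ab = matmul(ketbra(c, d), ketbra(a, b))
rule4_factorizes = same(ab_cd, scale(braket(b, c), ketbra(a, d)))
rule4_order_matters = not same(ab_cd, cd_ab)

# Rule 6: (|a><b|)^dagger = |b><a|.
rule6_dagger = same(dagger(ketbra(a, b)), ketbra(b, a))

# Rule 7: the projector keeps only the part of phi parallel to psi.
psi = [0.6, 0.8j]            # normalized: 0.36 + 0.64 = 1
phi = [1, 1]                 # some other state
P = ketbra(psi, psi)
c1 = braket(psi, phi)        # the coefficient of the parallel part
projected = act(P, phi)
rule7_projects = all(abs(p - c1 * x) < 1e-12 for p, x in zip(projected, psi))
rule7_hermitian = same(dagger(P), P)
```

Each {{{rule...}}} flag is a boolean recording whether the corresponding identity holds for these particular vectors. This is a spot check, not a proof, so the exercises still stand.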
/***\nAuthors: Eric Shulman\nversion: 1.0.0\nsource: http://www.tiddlytools.com/\nThis plugin allows you to disable TiddlyWiki's automatic WikiWord linking behavior, so that WikiWords embedded in note content will be rendered as regular text, instead of being automatically converted to note links. To create a note link when automatic linking is disabled, you must enclose the link text within {{{[[}}} and {{{]]}}}.\n!!!!!Code\n***/\n//{{{\nversion.extensions.disableWikiLinks= {major: 1, minor: 0, revision: 0, date: new Date(2005,12,9)};\n\n// G changed to have this on, without checkbox\nconfig.options.chkDisableWikiLinks= true;\n\n// find the formatter for wikiLink and replace handler with 'pass-thru' rendering\nfor (var i=0; i<config.formatters.length && config.formatters[i].name!="wikiLink"; i++);\nconfig.formatters[i].coreHandler=config.formatters[i].handler;\nconfig.formatters[i].handler=function(w) {\n // if not enabled, just do standard WikiWord link formatting\n if (!config.options.chkDisableWikiLinks) return this.coreHandler(w);\n // suppress any leading "~" (if present)\n var skip=(w.matchText.substr(0,1)==config.textPrimitives.unWikiLink)?1:0;\n w.outputText(w.output,w.matchStart+skip,w.nextMatch);\n};\n//}}}
<script label="O" title="toggle sidebar">\n var sb=document.getElementById('sidebar');\n var da=document.getElementById('displayArea');\n if (sb.style.display == 'none') {\n da.style.marginLeft = '18.5em';\n sb.style.display = 'block';}\n else {\n da.style.marginLeft = '0em';\n sb.style.display = 'none';}\n</script> <script label="O" title="toggle title">\n var h=document.getElementById('head');\n if (h.style.height == '1em') {\n h.style.height = '5.8em';}\n else {\n h.style.height = '1em';}\n</script> \n
<div class='toolbar' macro='toolbar +saveNote copyNote deleteNote closeOthers -cancelNote'></div>\n<div class='title' macro='view title'></div>\n<div class='editor' macro='edit title'></div>\n<div class='editor' macro='edit text'></div>\n<div class='toolbar' macro='toolbar +saveNote copyNote deleteNote closeOthers -cancelNote'></div>\n<div class='editorFooter'><span macro='tagChooser'></span><span macro='message views.editor.tagPrompt'></span></div>\n<div><span class='editor' macro='edit tags'></span></div>
!!Guidelines\n*Keep the notes short — if it gets long, split it. Somewhere between a paragraph and a page is about right.\n*Don't use objects or methods without linking references (once) for the reader.\n*The first time you reference another object in a note, link to its eponymous note.\n**If you reference an object defined in a non-eponymous note, link to it again by [ [ object | note ] ]. \n*If you define some object, make the object name ''bold''.\n**If you include a pseudonym for a defined object, make it //''bold italic''//.\n*Include all steps of calculations, with the manipulations established from referenced notes.\n*Link to [[Wikipedia|http://en.wikipedia.org/]] for standard stuff, or to arxiv papers and other things as needed.\n*Include introduced symbols in the [[Symbols]] table.\n*Include introduced tags in [[Tags]] and<<tag folder>>. //Don't do this often//\n*Examples are OK, even if they run a little long.\n*It is good to start a note as "A ''this thing'' is a kind of [[link to more general case]] which //more detailed properties//..."\n*In general, don't link "down" to more specific instances -- that's what the "referenced by" button is for. 
Unless it seems pedagogically useful.\n*Lateral (mutual) linking is OK, such as in the [[differential form]] note -- for similar objects, special generalized cases, or extended discussions.\n*Attempt to duplicate the same structure of links for similar structures of mathematical objects and instantiations.\n*Links that establish a hierarchical relationship should probably be treated differently than links to define used objects -- but for now they're the same.\n*If you link to a note that doesn't exist, create that note and tag it with<<tag 0>>if you leave it empty.\n*Tag notes that desperately need editing with<<tag 0>>.\n*Put editorial comments (//like this one//) in parens and italics.\n!!Markup\nThere are many markup formatting commands and features that can be used when <<tag editing>> notes:\n<<ListTagged editing>>\nYou can do other neat stuff with more <<tag plugin>>s.\n
Images can be included by their filename or full URL. It's good practice to include a title, which is shown as a tooltip and when the image isn't available. An image can also link to another note or a URL.\n[img[Romanesque broccoli|images/fractalveg.jpg][http://www.flickr.com/photos/jermy/10134618/]]\n{{{\n[img[Romanesque broccoli|images/fractalveg.jpg]\n [http://www.flickr.com/photos/jermy/10134618/]]\n[img[title|filename]]\n[img[filename]]\n[img[title|filename][link]]\n[img[filename][link]]\n}}}\n[<img[Forest|images/forest.jpg][http://www.flickr.com/photos/jermy/8749660/]][>img[Field|images/field.jpg][http://www.flickr.com/photos/jermy/8749285/]]You can also float images to the left or right: the forest is left aligned with {{{[<img[}}}, and the field is right aligned with {{{[>img[}}}.\n@@clear(left):clear(right):display(block):You can use CSS to clear the floats@@\n{{{\n[<img[Forest|images/forest.jpg][http://www.flickr.com/photos/jermy/8749660/]]\n[>img[Field|images/field.jpg][http://www.flickr.com/photos/jermy/8749285/]]\nYou can also float images to the left or right:\n the forest is left aligned with {{{[<img[}}},\nand the field is right aligned with {{{[>img[}}}.\n@@clear(left):clear(right):display(block):\nYou can use CSS to clear the floats@@\n}}}
This project comes in two flavors:\n# A 3-4 page printed article, modeled after a Physical Review Letters paper (see, for instance, <http://arxiv.org/abs/quant-ph/0205033> or <http://arxiv.org/abs/0704.3615> as models for formatting and size -- not content! Both of these are on the big end, and both are research articles rather than expository articles.),\n# An article on Wikipedia, or a cluster of pages on Wikipedia.\n\nThe point of this project (whichever form you choose) is to demonstrate that you've explored a topic beyond the canon of the course. I've suggested a variety of topics (see [[Good Paper Topics]]), but these are merely suggestions, based on my own interests. If you write an article in paper format, then its sole purpose is as a learning exercise for you, and as a demonstration to me that you've explored a topic on your own. If you write a Wikipedia article, it will also serve the world community, by making something clear that was not previously clear.\n\nSome general guidelines:\n* You don't need to know everything about your topic! Some of the possible topics are tremendous, and books have been written about them. The point is to get a good general idea of what's going on with that problem, topic, or concept, and explain it in relatively straightforward language. Detailed calculations are not apropos, and you are not responsible for knowing how to do them! What you are responsible for is a general understanding of "what is the point," and "how does it work." Your article should cover most of the following points to one degree or another:\n** Why is this an interesting topic?\n** What areas of mathematics or physics is it related to?\n** What are the major results, and/or major open problems related to this?\n** What is it good for? Who uses this branch of theory, or experimental technique?\n* You should eschew technical derivations when possible, and focus instead on explaining the topic to an audience of your peers. 
You do not need to prove things! There are books out there in which proofs can be found. You should cite those books, papers, etc.\n* This is an expository article -- i.e., an article that seeks to explain something. Therefore, you should write in a literate fashion -- correct grammar, spelling, punctuation, etc. You should adopt a reasonably formal tone: this is not a letter to a friend. However, it's also not a letter to a lawyer... strive for comprehensibility above all else. Short sentences are good, if you are in doubt.\n* Do not copy the style of my lecture notes! I write quite informally in these notes -- it is my goal to have a conversation with you. Your paper needs to be a bit more formal than that.\n!!! Guidelines for non-Wikipedia articles\n\n* You should cite your sources. I will not be picky about //how// you cite them (i.e., what style you use, etc), but the citations need to be consistent and sufficient for me or someone else to track down the original sources. You may use a single source, although I would prefer more than one. There is no upper limit on the number of sources you may cite, and you do not have to have read something in order to cite it! Citing an original paper just because one of your sources cited it is fine. Don't go overboard, though -- more than about 30 sources total is probably overkill.\n* I highly recommend writing in LaTeX. You must use a computer program of some sort -- I will not accept handwritten papers. If you choose to use Word or another program, that's okay, but please submit a PDF of your paper for (a) peer review, and (b) final grading. If you find this absolutely impossible, I will accept a pristine hardcopy, but would much prefer a PDF.\n* The minimum reasonable length is 2 pages, and the maximum is 5. See <http://arxiv.org/abs/quant-ph/0205033> or <http://arxiv.org/abs/0704.3615> for guidelines on font size, margins, etc. Please format your paper single-spaced, in two columns if possible. 
The easiest way by far to do this is to use LaTeX and use the "revtex4" document class that is used for Physical Review. I've included a [[sample TeX file|pdf/ArticleTemplate.tex]] that you can crib from -- this is actually just the source for <http://arxiv.org/abs/quant-ph/0205033> with most of the content snipped out.\n\n!!! Guidelines for Wikipedia articles\n\n//(see also [[Some pointers for Wikipedia articles]])//\n\n* In order to write a Wikipedia article, you'll have to be somewhat familiar with Wikipedia. A good place to start is <http://en.wikipedia.org/wiki/Wikipedia:Questions>, and your next stop might be <http://en.wikipedia.org/wiki/Wikipedia:FAQ>. You'll certainly want to create a user account <http://en.wikipedia.org/wiki/Special:UserLogin/signup>. Then you should read about creating an article at <http://en.wikipedia.org/wiki/Wikipedia:Your_first_article>, with more information at <http://en.wikipedia.org/wiki/Wikipedia:Starting_an_article>. When you're writing, you may find <http://en.wikipedia.org/wiki/Wikipedia:Cheatsheet> useful. Since you'll be including some math, you'll probably find <http://en.wikipedia.org/wiki/Help:Displaying_a_formula> useful as well.\n* Style is somewhat important. This is an encyclopedia, and should have a fairly uniform style. You should look at some of the good articles on related topics, and absorb the style presented there. 
You will rarely get in trouble for being __too__ dry (passive voice is always acceptable in the sciences, though a good writer will try to minimize it), but you should avoid using "You" at all costs, and minimize your use of "we".\n* Some good articles to use as role models (note: some of these are far too big to serve as a project, but it's the ''style'' I'm focusing on here!): <http://en.wikipedia.org/wiki/Quantum_mechanics>, <http://en.wikipedia.org/wiki/Lagrangian_mechanics> (good example of how to do pedagogical mathematical derivations), <http://en.wikipedia.org/wiki/Hydrogen-like_atom> (a somewhat rougher example of pedagogical derivations... I think this article should use "we" and "Let us" less), <http://en.wikipedia.org/wiki/Momentum>, <http://en.wikipedia.org/wiki/Phase_space> (a bit short and rough), <http://en.wikipedia.org/wiki/Hilbert_space> (pretty mathematical), <http://en.wikipedia.org/wiki/Hydrogen_atom>, <http://en.wikipedia.org/wiki/Wigner_function>, <http://en.wikipedia.org/wiki/Bloch_wave> (a bit short -- if this was a project submission I'd want another related page too, like maybe <http://en.wikipedia.org/wiki/Kronig-Penney_model>). There are bajillions of other ones.\n* Wikipedia has a standard policy and style for citing sources. Again, you don't need more than one source, but you do need at least one.\n* You should submit your article for review by sending me a link to a particular revision of your article (or of your user page, if you are hosting your article there while it is being polished). This is very important! If you send me a link that is ''not'' a permalink to a particular revision, then it may change without notice while it's being reviewed, which will not be good for you.\n\n!!! Peer review\n\nYou will submit your article to me by 6:00 PM on Friday, November 14, and I will send it out to one or more of your colleagues for peer review. ''This is not optional.'' Do not skip this deadline. 
Your article should be essentially finished by the time it is submitted for review! Or, at least, you should think it's acceptable. The point of review is to point out things that you missed, errors you didn't notice, and suchlike. You'll have about a week after it's returned from review to spruce and polish your article before you submit it for final evaluation.\n\nIf you have left the project until the last minute, that's unfortunate. You should engage in what is called "triage", which means doing the most important things first. The most important aspect of your article to have done for peer review is ''content''. You should have the major content of your article in place. This does not mean that you should throw together a bunch of random notes and submit them! While I will be grading primarily based on the final product, submitting a steaming heap of **** for review will not be entirely ignored in the final evaluation! You should prioritize content, but patch it together with at least a thin veneer of style! \n\nThis is even more true for Wikipedia articles: poorly written articles run the risk of being deleted with prejudice by the Wikipedia administrators. I will not hold ''you'' responsible for what other people add to or subtract from your article... but if you write a sufficiently poor article that it gets nuked by the admins, ''I'' am not responsible. At the very least, you should keep a copy on your user page or something similar, in case of emergencies. I'll be sympathetic to claims of unfairness, but my response will depend on how good the article actually is. I'm afraid I won't go easy on an incoherent article, even though I want you to succeed!\n\nI will ask the reviewers to comment on the following aspects of the article:\n* ''Suitability'': Is this a suitable topic for this project? Is it within the scope of "quantum theory", but not something that we already covered in detail in the course? 
Does the author appear to have engaged with the topic at the level of this class, or are the issues discussed too elementary and simple?\n* ''Correctness'': Does the article appear to be factually correct? Do you see any glaring errors, or really implausible statements? Do you see any mathematical errors or typos?\n* ''Completeness'': Does the article appear to be of an appropriate size? If too small, what needs to be added? If too big, what can be dropped? Are there glaring omissions, where at least a reference or link should be made?\n* ''Writing'': Is the article readable and engaging? Did it successfully explain the topic, or at least part of it, to you? Are there any sections that need to be rewritten extensively?\n* ''Technical'': Is the spelling, grammar, punctuation, and bibliographic information correct? Are there any areas that need improvement?\n* ''Style:'' Is the style of the article appropriate for its forum? Is it accessible to an advanced undergraduate math/physics major?\n\n[[Referee report form for Wikipedia articles]]\n[[Referee report form for expository articles]]
''Bold''\n{{{''Bold''}}}\n==Strikethrough==\n{{{==Strikethrough==}}}\n__Underline__ \n{{{__Underline__}}}\n//Italic// \n{{{//Italic//}}}\n2^^3^^=8 \n{{{2^^3^^=8}}}\na~~ij~~ = -a~~ji~~ \n{{{a~~ij~~ = -a~~ji~~}}}\n@@highlight@@ \n{{{@@highlight@@}}}\n\n//The highlight can also accept CSS syntax to directly style the text://\n@@color:green;green coloured@@\n{{{@@color:green;green coloured@@}}}\n@@background-color:#ff0000;color:#ffffff;red coloured@@\n{{{@@background-color:#ff0000;color:#ffffff;red coloured@@}}}\n@@text-shadow:black 3px 3px 8px;font-size:18pt;display:block;margin:1em 1em 1em 1em;border:1px solid black;Access any CSS style@@\n{{{@@text-shadow:black 3px 3px 8px;font-size:18pt;display:block;margin:1em 1em 1em 1em;border:1px solid black;Access any CSS style@@}}}\n@@display:block;text-align:center;centered text or image@@\n{{{@@display:block;text-align:center;centered text or image@@}}}\n\n//For backwards compatibility, the following highlight syntax is also accepted://\n@@bgcolor(#ff0000):color(#ffffff):red coloured@@\n{{{\n@@bgcolor(#ff0000):color(#ffffff):red coloured@@\n}}}
! Homework, grading, and the course in general\n\n[["The homeworks are really hard! and I can't/won't/don't work in a group!"|FAQ: Challenging homeworks]]\n\n[["Homework #2 is giving me fits. Any suggestions?"|FAQ: HW2]]\n\n[["Can I earn kudos, karma, and participation points by fixing and improving how the course website works?"|FAQ: TiddlyWiki fixes and development]]\n\n! Quantum mechanics in finite-dimensional Hilbert space\n\n[["What's up with vector operators? If I have a `vector' operator acting on a state, does it return a row vector or a column vector as its eigenvalue?"|FAQ: Vector operators]]\n\n[["Looking at Lecture 5, it looks like you rewrite Born's rule just to make it look like the classical probability rule. Then you introduce a new view of states and measurement outcomes, as operators (vectors in Hilbert-Schmidt space). Did you rewrite Born's rule just to make it look like the classical probability rule, or was there something wrong with the original form of Born's Rule?"|FAQ: Bilinear form of Born's Rule]]\n\n! Quantum mechanics in $L^2(\sreals)$\n\n[["What do we need to know about rigged Hilbert space for the midterm?"|FAQ: Essentials of RHS]]\n\n[["What the heck are |FAQ: Subset structure of RHS]]$\sPhi'$, $\sPhi^*$, and $\sPhi^x$?"\n\n[["When we translate a state around a closed loop in phase space, and get a Berry's phase, which operator contributes the phase?"|FAQ: Berry's Phase]]\n\n! Probability theory, indicator functions, observations, and Bayes' Rule\n\n[["I noticed that you defined p( x ) differently in two of your lectures. In Lecture 12 (Transformations on Quantum Pure States) you define p( x ) as a 'probability distribution function', whereas in Lecture 14 you defined p( x ) as a 'probability distribution'. Is there any significance to this or is it just an error?"|FAQ: Probability distributions]]\n\n[["When I make an observation, I observe one event, right? 
So if I write that observation as a set of indicator functions, should I just include the ones corresponding to the event that I actually observed?"|FAQ: What are observations?]]\n\n[["Probability theory and Bayes' Rule are kicking my butt. Where can I learn and practice more?"|FAQ: Probability and Bayes' Rule]]\n\n! LaTeX, and technical issues with the course website.\n\n[["Why doesn't the source code from your lecture notes or the homework set compile in LaTeX?"|FAQ: TiddlyWiki and LaTeX]]\n\n[["What LaTeX macros are you using in your notes?"|FAQ: LaTeX Macros]]\n\n[["I'm having trouble getting LaTeX to work, and it needs the amsmath package, and your LaTeX macros have bugs in them anyway! Could you provide a template?"|FAQ: LaTeX template]]\n\n[["When I try to print pages from the course website, or export to PDF, weird stuff happens with the symbols!"|FAQ: TiddlyWiki printing]]\n\n! Miscellaneous questions\n\n[["How should I cite (and link to) scientific articles if I'm writing an email, a web page, or another informal document?"|How to cite arxiv.org articles online]]\n\n[["How should I go about reviewing a paper for the Journal of Advanced Quantum Theory?"|FAQ: Refereeing]]
The assignments are rather difficult because it's a challenging subject. They are chosen to be relatively simple -- unlike the homework problems for a graduate course in quantum mechanics, where anything goes and you will typically be asked to solve realistic physical problems! However, this is a 4th year capstone course, and the assignments are qualitatively different from those in lower-division classes. They are *not* (by and large) mechanical reproductions of problems you've seen solved in class; instead, you need to figure out how to apply the theory presented in class and in the notes to solve the problem.\n\nThis is often an intellectual challenge. Like almost all intellectual challenges, these problems get much easier when you attack them in groups. Collaboration is a tremendously valuable asset in every field and profession, and absolutely essential in physics. If you work with other people, it will probably make the assignments much easier and more productive -- but finding those other people is a task you'll have to accomplish on your own. It's also a very valuable skill down the road. I suspect there are a number of other people in the class who would welcome a study partner. It may be as simple as asking around until you find such a person.\n\nIn general, *every* part of quantum theory is related -- often quite closely -- to "pure" math. There's little practical distinction between pure and applied math. For this course, I'm assuming familiarity with the *basic* ideas of calculus, differential equations, linear algebra, probability, and analysis -- all of which are in the canon of physics and applied math (i.e., it's essentially impossible to do physics without them). Group theory and basic functional analysis are also tied in closely with quantum mechanics, and I'm presenting the essentials in class. However, for a *full* understanding of what we're doing, you'll have to read up outside. 
As always, Wikipedia is a free and excellent reference for virtually all the math we are using.\n
Some good discussion and examples of Bayes' Rule:\n\n1. Wikipedia's entry is good, at <http://en.wikipedia.org/wiki/Bayes%27_theorem>.\n\n2. David MacKay's book at <http://www.inference.phy.cam.ac.uk/mackay/itila/book.html>, specifically Chapters 2 <http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/22.40.pdf> and 3 <http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/47.59.pdf>, and the exercises therein (especially those at the beginning of Chapter 3).\n\n3. Another online resource is <http://cnx.org/content/m10985/latest/>\n\n4. Yet another is <http://www.dcs.qmul.ac.uk/~norman/BBNs/Bayes_rule.htm> and <http://www.dcs.qmul.ac.uk/~norman/BBNs/Bayes_Rule_Example.htm>\n\n5. And possibly <http://www.cim.mcgill.ca/~friggi/bayes/>\n\n
First of all, go read the guidelines for the [[Research project]] or [[Expository article]], depending on which kind of submission you're reviewing. Then, go grab (and read) the relevant referee report form:\n* [[Referee report form for research articles]]\n* [[Referee report form for expository articles]]\n* [[Referee report form for Wikipedia articles]]\nIf you're reviewing a Wikipedia article, then you might want to scan [[Some pointers for Wikipedia articles]], to see if the author has slipped up on any basic features.
''YES!''\n\nLaTeX is the lingua franca of the physics and math communities, and it will remain so for some time. However, as of 2008, LaTeX produces great PDF and paper output -- but //not// good web or hypertext output. You can use the {{{hyperref}}} package to make a linked PDF, but right now this is a kludge; hyperlinked PDF is not a particularly friendly format, and it's very limited compared with what can be done using Web 2.0 technologies. The future of written communication -- particularly in science -- is on the web. Thus, this website. It's my experiment (of the jumping-in-with-both-feet variety) in improved scientific communication.\n\nThis kind of technology is still in the Wild West phase (Lewis and Clark was a few years ago). There's no standard, no consensus, and no obviously optimal package. Therefore, the technology we're using for this course is eminently buggy. And has design flaws. And can be improved. By improving it, you will help me, your classmates and colleagues, the other scientific users of TiddlyWiki, and the entire scientific community. What are you waiting for?\n\nThis page (which I'll continue to update) lists some of the possible improvements. TiddlyWiki is not only open source, but also //very// plugin-friendly. If you poke around the website, particularly [[here|Configuration]], you'll find a bunch of plugins, and you can look at them directly. WYSIWYG, except for jsMath, which is the JavaScript interpreter for LaTeX and has a bunch of code and font files hidden away outside of this web page. Everything else is right here for you to look at. At [[www.tiddlywiki.org|http://www.tiddlywiki.org/]], you can find lots more information on TiddlyWiki.\n\n# The world needs a script that will translate TiddlyWiki markup into compilable LaTeX. It's eminently doable in Python, Perl, or your favorite text processing language. 
The math stuff -- {{{$a^2=b^2+c^2$}}} is totally portable, but things like //italics// (TW: {{{//italics//}}} vs LaTeX: {{{\semph{italics} }}}) need to be translated. A longer (but not complete) list can be found at [[FAQ: TiddlyWiki and LaTeX]].\n# A related project is to obtain or design a plugin that will convert a TiddlyWiki note (or perhaps several of them) directly to PDF. Perhaps going through LaTeX on the way. This would be a great enhancement to the current protocol of using the print function in your browser.\n# Figure out why printing occasionally fails, and fix it or design a reliable workaround procedure.\n# Figure out why fractions sometimes print without the bar between numerator and denominator, and fix it.\n# A somewhat harder challenge is to write a script that translates LaTeX into TiddlyWiki. LaTeX is much more powerful, so this would involve some design decisions (what do we do, gracefully, when the source file has LaTeX constructs for which TiddlyWiki has no equivalent?). The goal here would be:\n** for any LaTeX code that was created by exporting from TiddlyWiki, faithfully translate it //back// to TiddlyWiki, and\n** for more complex documents, provide a reasonable interpretation of the original that can be tweaked by human editing.\n# Make it possible to display PDF in TiddlyWiki. There may already be a plugin.\n
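To give a feel for the first item on the list (translating TiddlyWiki markup into compilable LaTeX), here is a minimal Python sketch. It handles only bold, italics, and underline, and leaves anything between dollar signs untouched. The function name {{{tw_to_latex}}} and the rule table are hypothetical, invented for this illustration; a real converter would also need headings, lists, links, and the escape handling described in [[FAQ: TiddlyWiki and LaTeX]].

```python
import re

# (pattern, replacement) pairs for inline markup. This table is an
# illustrative assumption and is far from complete.
RULES = [
    (re.compile(r"''(.+?)''"), r"\\textbf{\1}"),            # ''bold''
    # Lookbehinds skip the "//" that follows a colon, as in "http://".
    (re.compile(r"(?<!:)//(.+?)(?<!:)//"), r"\\emph{\1}"),  # //italics//
    (re.compile(r"__(.+?)__"), r"\\underline{\1}"),         # __underline__
]

# Math segments ($...$ or $$...$$) are already LaTeX and must not be touched.
MATH = re.compile(r"(\$\$.+?\$\$|\$.+?\$)", re.DOTALL)

def tw_to_latex(text):
    """Translate a few TiddlyWiki inline styles to LaTeX, skipping math."""
    parts = MATH.split(text)
    # With one capturing group, re.split alternates non-math (even
    # indices) and math (odd indices); only the former are rewritten.
    for i in range(0, len(parts), 2):
        for pattern, replacement in RULES:
            parts[i] = pattern.sub(replacement, parts[i])
    return "".join(parts)
```

For example, {{{tw_to_latex("//italics// and ''bold''")}}} yields the emph and textbf forms of the two words, while a formula such as {{{$a^2=b^2+c^2$}}} passes through unchanged. Splitting on math first is one possible design choice; it keeps the markup rules from ever seeing LaTeX source.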
Yeah, there are at least three things that occasionally go wrong with printing from TiddlyWiki, for as-yet-non-understood reasons. \n# The math symbols -- or at least the ones after the first page -- may disappear.\n# The math may appear as LaTeX source code rather than nice-looking math.\n# The bar between numerator and denominator of fractions may periodically disappear.\n\nThe first and second problems seem to be related. One thing that reliably causes them (especially #2) is to load a page (i.e., a note within the wiki, like the homework set), and print it quickly. It takes a while to process the math on the page, and if you print before it's all processed, weird things happen (including getting the LaTeX source for math instead of the math itself, or not getting anything).\n\nThere also seem to be occasional misfires for other reasons, possibly having to do with the browser and/or printer driver.\n\nSo, the first thing to do is:\n(1) Reload the note you want to print, and wait a bit (60 seconds should be more than enough),\n(2) Try to print it\n(3) If that doesn't work the first time, try it again (possibly from a different browser).\n\nIf the problem is consistent, let me know. I think that printing is working for most people, but in the last resort I can print the notes to PDF and upload them to the site. I *don't* want to do this if I can avoid it, because then we lose the whole point of dynamically editable content (which is that I can improve it as we go along).\n\nIncidentally, figuring out what causes these sorts of things and suggesting a fix -- or various other research projects related to TiddlyWiki and its use in this course -- are *excellent* ways to achieve mondo participation karma. Not to mention the gratitude and respect of the instructor. 
See [[FAQ: TiddlyWiki fixes and development]].\n\nI have no idea what causes the minor but annoying (but repeatable) issue where fractions occasionally lose the line between upper and lower number, even when the printing is otherwise perfect. It's probably not a big deal except in some very obscure circumstances.\n
Oh, this is a good one! We try to avoid actually using operators like $\svec{\smathbf{J}}$ directly, for a couple of reasons. The most important is that it's confusing! So we virtually always -- as in [[Homework #3]] -- pick out a particular component of $\svec{\smathbf{J}}$ (like $\smathbf{J}_x$ or $\svec{\smathbf{J}}\scdot\svec{n}$) before actually //using// the operator. Another reason is that ''$\svec{\smathbf{J}}$ doesn't actually //have// any eigenvalues!'' (see below)\n\nThat being said, here's the way you need to think about it. An operator with a vector sign over it is actually a vector of operators. The vector space in which it lives is *not* a Hilbert space for the system, nor does it have anything to do with that Hilbert space! It's usually the three-dimensional space that we live in (i.e., with basis vectors $\shat{x}$, $\shat{y}$, and $\shat{z}$). The definition of the angular momentum operator, for instance, is\n$$\svec{\smathbf{J}} = \smathbf{J}_x\shat{x} + \smathbf{J}_y\shat{y} + \smathbf{J}_z\shat{z}$$\nSo, you see, this is not necessarily a row //or// a column vector -- it depends on how you want to write your vectors in space. Most importantly, it's certainly not a row or column vector in the system's Hilbert space! If you were to act on a state $\sket\spsi$ with the $\svec{\smathbf{J}}$ operator, you would get\n$$\svec{\smathbf{J}}\sket{\spsi} = \smathbf{J}_x\sket{\spsi}\shat{x} + \smathbf{J}_y\sket{\spsi}\shat{y} + \smathbf{J}_z\sket{\spsi}\shat{z},$$\nwhich is a 3-vector of kets! This is a very weird object, and not one that we're going to use at any point (nor, in fact, can I think of any use for it off the top of my head).\n\nFinally, we can see why $\svec{\smathbf{J}}$ has no eigenvalues. If you act on a state $\sket{\spsi}$, you get back a ''3-vector'' of kets, as above. 
This is a different mathematical object from the ket we started with, but it could still be an eigenvalue equation if we had\n$$\svec{\smathbf{J}}\sket{\spsi} = (j_x\shat{x} + j_y\shat{y} + j_z\shat{z})\sket{\spsi},$$\nbut this would require $\sket\spsi$ to be a //simultaneous// eigenstate of $\smathbf{J}_x$, $\smathbf{J}_y$, and $\smathbf{J}_z$. This is impossible because those three operators don't commute with each other, and so the Robertson uncertainty relation tells us that they can't all be well-defined (i.e., have a value) for the same state.\n\nI'll conclude by mentioning that the other vector operators you might see are $\svec{\smathbf{x}}$ and $\svec{\smathbf{p}}$. These are defined as\n$$\svec{\smathbf{x}} = \smathbf{x}\shat{x}+\smathbf{y}\shat{y}+\smathbf{z}\shat{z},$$\nand the same for $\svec{\smathbf{p}}$. In this case the three operators //do// commute, and so these vector operators //do// have eigenstates; they're simultaneous eigenstates of all three component operators.
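The statement that the components of the angular momentum operator don't commute is easy to verify for the smallest nontrivial case. The sketch below assumes a spin-1/2 system with units where hbar = 1, so each component is one half of a Pauli matrix; the names {{{JX}}}, {{{JY}}}, {{{JZ}}} and the helper functions are chosen just for this example.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def is_zero(A, tol=1e-12):
    """True if every entry of A vanishes (up to rounding)."""
    return all(abs(x) < tol for row in A for x in row)

# Spin-1/2 angular momentum components (assumption: hbar = 1).
JX = [[0, 0.5], [0.5, 0]]
JY = [[0, -0.5j], [0.5j, 0]]
JZ = [[0.5, 0], [0, -0.5]]

# The familiar relation [Jx, Jy] = i Jz; the right-hand side is built here.
comm_xy = commutator(JX, JY)
i_times_jz = [[1j * z for z in row] for row in JZ]
xy_matches = all(abs(comm_xy[i][j] - i_times_jz[i][j]) < 1e-12
                 for i in range(2) for j in range(2))

# None of the three pairs commutes.
none_commute = not (is_zero(commutator(JX, JY))
                    or is_zero(commutator(JY, JZ))
                    or is_zero(commutator(JZ, JX)))
```

Both {{{xy_matches}}} and {{{none_commute}}} come out true for these matrices. This only illustrates the spin-1/2 case; it says nothing by itself about other representations.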
The interesting thing about Berry's phase -- the $e^{-i\sdelta p\sdelta x/\shbar}$ in \n$$\shat{A}_{-\sdelta p} \shat{S}_{-\sdelta x} \shat{A}_{\sdelta p} \shat{S}_{\sdelta x} = e^{-i\sdelta p\sdelta x/\shbar}\sId$$\nis ''precisely'' that it isn't a property of any of the four operators. We might very well expect that if we apply four operators (e.g., $A$, then $B$, then $C$, then $D$), and the net result is to pick up a phase, then at least one of those operators must have caused a change in phase. However, this is simply not the case! The phase is not a property of any translation, but of the entire loop that they describe in phase space.\n\nThere are two ways to see this.\n# The phase we pick up is proportional to the *area* of the loop around which we move in phase space. Each operator is parametrized by a length -- i.e., a distance in phase space. If the phase was coming from the individual operators, then it would have to scale with the length of the path -- i.e., the perimeter of the loop. But as we make the loop larger, the phase is proportional to the area, which scales as the square of the perimeter.\n# If the phase was produced by the individual operators, then we could presumably change the operators by multiplying each one by a phase that would cancel out the net phase that we pick up as we go around a loop. Indeed, for a particular loop we can do this... but there's no way to do this for *all* loops. That is, we cannot add a phase to the operators that cancels out the Berry phase for all loops that can be produced by those operators.\n\nSo this is truly a collective effect -- the product of four operators yields a phase which *cannot* be thought of as coming from any individual operator.\n\nThe general phenomenon is called "geometric phase", and it occurs in both classical and quantum systems. The Aharonov-Bohm effect is a quantum manifestation, and Foucault's Pendulum is a classical example. 
You can find more information at http://en.wikipedia.org/wiki/Geometric_phase .\n\n''Followup question'': "Sure, the phase isn't produced by any single operator -- but we could sum up contributions due to all the operators, right? Also, suppose we took a path that is not closed. What is the Berry phase for an //open// loop -- i.e., a sequence of translations that doesn't bring us back to the origin? Surely it can't be zero right up until the loop is closed!"\n\nThe key phrase is "sum up contributions due to all the operators". It's very natural to imagine that the total phase is the sum of the contributions of individual operators... but it's wrong! This is what makes Berry's phase interesting.\n\nThere are four individual operators here. Let's just call them $A$, $B$, $C$, and $D$. Now, their product $ABCD$ is equal to $e^{i\stheta}\sId$. If the total phase $\stheta$ was a sum of terms for the four operators, then we'd have\n$$\stheta = \stheta_A + \stheta_B + \stheta_C + \stheta_D.$$\nNow, suppose we make all the translations twice as big. So instead of $A$, we apply $A^2$. Now we have $AABBCCDD$, and that is still proportional to $\sId$, but with a different phase: $$AABBCCDD = e^{i\sphi}\sId$$\n''If'' the total phase was a sum of terms, then we should pick up twice as much of it -- i.e., we should get $\stheta_A$ twice, $\stheta_B$ twice, etc. We should get\n$$\sphi = 2(\stheta_A + \stheta_B + \stheta_C + \stheta_D) = 2\stheta.$$\n\nHowever, this is not what happens. The total phase is proportional to the //area// of the loop, so in fact\n$$\sphi = 4\stheta,$$\nbecause this loop has four times the area of the first one.\n\nNow, regarding "suppose we took a path that was not closed...", it's not really right to say "just as the two end points touched each other a phase would appear." Rather, the notion of a phase is only defined when the final state and the initial state are the same *except* for a phase. 
If I were to ask "What is the relative phase between $\sket\suparrow$ and $\sket\srightarrow$?", it's a meaningless question. Similarly, "What is the relative phase between $\sket\suparrow$ and $-\sket\srightarrow$?" is meaningless. However, I //can// talk about "the relative phase between $\sket\suparrow$ and $-\sket\suparrow$," which is -1.\n\nPerhaps a good analogy is to consider an operator $A$ and ask "What is the eigenvalue of a general vector $\sket\spsi$?" This is meaningless -- most vectors are not eigenvectors of $A$, and therefore have no eigenvalue. Even if $\sket\spsi$ is very close to an eigenvector, it still has no eigenvalue. But as soon as $\sket\spsi$ becomes an eigenvector, it suddenly has an eigenvalue.\n\nSimilarly, Berry's phase is a property only of closed loops. If we consider a sequence of translations forming an open loop -- i.e., one that does not leave the final state equal to the initial state except for a phase -- then there is no well-defined Berry's phase.
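\n\nTo make the area scaling concrete, here's a short worked check using the loop identity from above. The only assumption I'm adding is that two identical translations compose into a single translation of twice the size (so $\shat{S}_{\sdelta x}\shat{S}_{\sdelta x} = \shat{S}_{2\sdelta x}$, and the same for $\shat{A}$). For the original loop,\n$$\stheta = -\sfrac{\sdelta p\s,\sdelta x}{\shbar},$$\nand replacing $\sdelta x \sto 2\sdelta x$ and $\sdelta p \sto 2\sdelta p$ gives\n$$\sphi = -\sfrac{(2\sdelta p)(2\sdelta x)}{\shbar} = 4\stheta,$$\nnot the $2\stheta$ that a sum of per-operator contributions would predict.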
Indeed, Born's Rule is the same (mathematically) no matter how we\nwrite it. The physical content does not change. Both ways, i.e.\n\sbegin{eqnarray}\nP(\sphi_j|\spsi)~ &=& |\sbraket{\sphi_j}{\spsi}|^2 \s\s\nP(\sphi_j|\spsi)~ &=& \sTr\sleft( \sproj{\sphi_j}\sproj{\spsi} \sright)\n\send{eqnarray}\nare correct, fine, and identical. Unrolling the first expression to\ngive the second form, however, has two effects (both of which you've\nnoted). First, it looks much more like the classical bilinear formula\nfor the probability of an event. Second -- and much more\nsignificantly -- it demonstrates that if we move our calculation to a\nnew vector space (that of operators, rather than that of pure states),\nwe have a bilinear expression for probability. If we embrace this\nwholeheartedly, then we find ourselves with a new tool that we can do\na *lot* with! That tool is the vector space of operators, and its\nassociated Hilbert-Schmidt inner product, and in this space we can use\nBorn's Rule to:\n\n1. Build Hermitian operators that represent observable quantities.\n2. Build density matrices that represent mixed quantum states.\n3. Build measurements beyond those represented by sets of orthogonal bras.\n\nWe have already seen the usefulness of #1 -- without it, we would have\nno objects representing observables, and therefore no way to make a\nconnection with classical physics (as we are doing in the current\nlectures). #2 has been used by brute force so far -- i.e., I just\nshowed how to construct such states, rather than making a compelling\ncase for why it is *necessary*, but we will see in Lectures 25-26 that\nit is mandated by the theory! #3 has really not been used at all, but\nwill make an appearance at the same time.\n\nSo, in summary: both forms are fine and useful, but the linearized or\n"unrolled" form involving the trace is a crucial conceptual step\nbecause it shows why the vector space of operators is somewhere we\nneed to go.
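\n\nFor completeness, here is a sketch of the "unrolling" step. It uses only the identity $\sTr\sleft(\sket{a}\sbra{b}\sright) = \sbraket{b}{a}$, which I'm assuming rather than deriving here:\n$$\sTr\sleft( \sproj{\sphi_j}\sproj{\spsi} \sright) = \sbraket{\sphi_j}{\spsi}\s,\sTr\sleft( \sket{\sphi_j}\sbra{\spsi} \sright) = \sbraket{\sphi_j}{\spsi}\sbraket{\spsi}{\sphi_j} = |\sbraket{\sphi_j}{\spsi}|^2 .$$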
The most important things to know are:\n\n# The difference between inner products between:\n## pairs of physical states, e.g. $\sbraket{\spsi}{\sphi}$: (yields a probability amplitude)\n## a physical state and an eigenvector of X or P, e.g. $\sbraket{\spsi}{x'}$, $\sbraket{p'}{\spsi}$: (yields a "wavefunction" $\spsi(x)$ or $\stilde\spsi(p)$, whose square can be interpreted NOT as a probability, but as a p.d.f. or probability distribution function $p(x)$, such that $p(x)\sdiff x$ is a measure)\n## an eigenvector of X or P and another eigenvector of X or P, e.g. $\sbraket{x}{x'}$ or $\sbraket{p}{p'}$ or $\sbraket{x'}{p'}$: (yields a distribution, and one of the two indices, e.g. $x$ or $x'$ in\n$\sbraket{x}{x'}$ MUST be left unspecified so that we can integrate over it).\n\n# The motivation and definition of $\sPhi$, the physical states\n## What states are in $\sPhi$? What are some states that aren't?\n## Why, and how, did we define $\sPhi$ this way? (//Hint: it's __not__ "so that we have a place for $\sket{x}$ and $\sket{p}$". That was a free bonus.//)\n## What are $\sPhi'$ and $\sPhi^x$, what sort of things do they contain, and why are they bigger than $\sPhi$ or $\smathcal{H} = L^2(\sreals)$?\n\nThe derivations of WHY $\sbraket{x}{x'} = \sdelta(x-x')$ and $\sbraket{p}{p'} = \sdelta(p-p')$ and WHY $\sbraket{x'}{p'} = \sfrac{1~}{\ssqrt{2\spi\shbar}~} \sexp( ix'p'/\shbar )$ are good reading, but don't need to be in your RAM.
# The first 3 problems are entirely classical. I highly recommend reviewing the slides for Lecture 3. Some of these words -- in particular "observable" and "observation" -- have very precise meanings. Don't just guess what they mean based on your experience in everyday life!\n# In particular, make sure that you know the difference between (a) an observable, (b) an observation, and (c) an //outcome// of an observation. Here's a hint: with respect to a flipped coin, "Heads" is neither an observable nor an observation.\n# In problem 2(c) "write it as a convex combination of 2 other states" does NOT mean "...as a convex combination of two of the other states in this problem". It means ANY states.\n# The answers to 2(a,d,e,f) are entirely mathematical. In other words, if you find yourself writing down sentences or phrases as the answer, you are doing it wrong. You can obviously use words to explain what you're doing, or to try for partial credit if you think you're getting it wrong, but the ANSWER is a mathematical object.\n# In 2(f), a good physical model for the observation is this: Instead of looking at the Trie after it is thrown, you ask your friend Guido to look at it and tell you the answer. Guido privately flips a fair coin: if it's heads he looks at the Trie and tells you the truth; if it's tails, he just makes up an answer at random (uniformly distributed over {1,2,3}) and tells it to you. If you're still confused, a good way to start is to ask "What if Guido *never* looked at the coin? What would the indicator functions be then?"\n# In 3(c,e), your guess is not random. 
It is the best possible guess -- so, for instance, if you were certain that the Trie was in the 1 state then you would obviously always guess "1".\n# In 3(c,e,f), you will have to use Bayes' Rule to calculate the state of something //after// you make a measurement, as one step in the problem.\n# In 3(e) you are being asked to guess what kind of Trie I used, not what face was showing.\n# Question 4(e) is very hard. Don't kill yourself trying to figure it out completely. You may wish to investigate the Poincaré recurrence theorem, as it will provide some guidance.
The notes, homeworks, and website are in TiddlyWiki markup, and use the jsMath package to display LaTeX math using JavaScript. LaTeX uses \snewcommand{} to define macros; jsMath does not use this syntax, but has an equivalent feature. In my jsMathPlugin page on this website, I've defined a series of macros using jsMath's syntax:\n\n{{{\njsMath.Macro('ket','\s\sleft|#1\s\sright\s\srangle',1);\njsMath.Macro('bra','\s\sleft\s\slangle#1\s\sright|',1);\njsMath.Macro('braket','\s\sleft\s\slangle#1|#2\s\sright\s\srangle',2);\njsMath.Macro('braopket','\s\sleft\s\slangle#1|#2|#3\s\sright\s\srangle',3);\njsMath.Macro('ketbra','\s\sleft|#1\s\sright\s\srangle\s\s!\s\s!\s\sleft\s\slangle#2\s\sright|',2);\njsMath.Macro('proj','\s\sleft|#1\s\sright\s\srangle\s\s!\s\s!\s\sleft\s\slangle#1\s\sright|',1);\njsMath.Macro('Tr','\s\smathrm{Tr}');\njsMath.Macro('Id','1\s\s!\s\smathrm{l}');\njsMath.Macro('expect','\s\sleft\s\slangle#1\s\sright\s\srangle',1);\njsMath.Macro('diff','\s\smathrm{d}');\njsMath.Macro('eps','\s\svarepsilon');\njsMath.Macro('mat','\s\sbegin{pmatrix}#1\s\send{pmatrix}',1);\njsMath.Macro('dd','\s\sfrac{\s\sdiff #1}{\s\sdiff #2}',2);\njsMath.Macro('pd','\s\sfrac{\s\spartial #1}{\s\spartial #2}',2);\njsMath.Macro('ha','{\s\ssmall \s\sfrac{1}{2}}');\njsMath.Macro('fr','{\s\ssmall \s\sfrac{#1}{#2}}',2);\njsMath.Macro('del','\s\svec{\s\snabla}');\njsMath.Macro('inarrow','\s\scurvearrowright');\njsMath.Macro('outarrow','\s\scurvearrowleft');\n}}}\n\nThe equivalent LaTeX macros are:\n\n{{{\n\snewcommand{\sket}[1]{\sleft| #1 \sright\srangle}\n\snewcommand{\sbra}[1]{\sleft\slangle #1 \sright|}\n\snewcommand{\sbraket}[2]{\sleft\slangle #1 | #2 \sright\srangle}\n\snewcommand{\sbraopket}[3]{\sbra{#1}#2\sket{#3}}\n\snewcommand{\sproj}[1]{| #1\srangle\s!\slangle #1 
|}\n\snewcommand{\sTr}{\smathrm{Tr}}\n\snewcommand{\sId}{1\s!\smathrm{l}}\n\snewcommand{\sexpect}[1]{\sleft\slangle#1\sright\srangle}\n\snewcommand{\sdiff}{\smathrm{d}}\n\snewcommand{\seps}{\svarepsilon}\n\snewcommand{\smat}[1]{\sbegin{pmatrix}#1\send{pmatrix}}\n\snewcommand{\sdd}[2]{\sfrac{\sdiff #1}{\sdiff #2}}\n\snewcommand{\spd}[2]{\sfrac{\spartial #1}{\spartial #2}}\n\snewcommand{\sha}{{\ssmall \sfrac{1}{2}}}\n\snewcommand{\sfr}[2]{{\ssmall \sfrac{#1}{#2}}}\n\snewcommand{\sdel}{\svec{\snabla}}\n\snewcommand{\sinarrow}{\scurvearrowright}\n\snewcommand{\soutarrow}{\scurvearrowleft}\n}}}
Okay, I fixed the bugs in the macros. They should compile now.\n\nMore precisely, they //do// compile. Here's [[Homework #2 in LaTeX|pdf/Homework2.tex]] (note: the goofy numbering on problem 4 is fixed), and here's [[the pdf|pdf/Homework2.pdf]] that {{pdflatex}} produces.\n\nI'm not planning to do this as a regular thing (unless I get a hold of a script that will translate TiddlyWiki into LaTeX), but I figured one example was worth it.
Good question. There's an unavoidable merging of notation between fields here.\n\n# In the context of probability and measure theory, a "probability distribution" is a probability measure -- i.e., a measure that assigns value 1 to the entire sample space. Since the measure is unambiguously $p(x)\sdiff x$, in this context "probability distribution" = $p(x)\sdiff x$. The function $p(x)$ is then a "probability distribution function" or p.d.f.\n# In the context of distribution theory, a distribution is a generalized function -- so a function $f(x)$ can act as a distribution, but so can something like $\sdelta(x)$ that is not a function. However (as I tried to convey in my discussion of "equal except on a set of measure zero"), the meaning of the distribution is as a linear functional on [test] functions. So, when I write $\sdelta(x)$, I really mean an object like:\n\n$$ \sint{ \sdelta(x) [\s \s \s \s \s ] \sdiff x} $$\n\nwhere the brackets with the big hole in between them are meant to represent a slot where you plug in a test function. Anyway, in this context a "probability distribution" is a distribution which has the properties:\n (a) It maps any strictly nonnegative function to a strictly nonnegative real number.\n (b) It maps the unit function (1) to the number 1.\nThis is a distribution that is also a probability measure... ergo, "probability distribution".\n\nSo, I suppose the ideal solution is to never use the word "probability distribution" in the latter context, perhaps substituting "distribution that is a probability measure" or maybe "probability measure distribution," but it's just so tempting to go with the obvious -- but potentially confusing -- lingo.\n\nAs I may have alluded, some things about the whole formalism irritate me. One -- as a guy who works with probability theory a lot -- is that there are all these "$\sdiff x$"s wandering around implicitly, inside the inner product for instance. 
There are reasons to privilege Lebesgue measure like that, but I don't like doing so in such an implicit way... most people forget that $\spsi(x)$ and $|\spsi(x)|^2$ only have meaning when you stick a dx on the latter!\n
Let our Hilbert space be $\smathcal{H} = L^2(\sreals)$. Then, recall that we defined $\sPhi \ssubset \smathcal{H}$ as the largest subset of $\smathcal{H}$ that is closed under the $\shat{x}$ and $\shat{p}$ operators. This turns out to be the set of Schwartz functions -- functions $f(x)$ that are smooth ($C^\sinfty$) and rapidly decreasing ($\slim_{|x|\sto\sinfty}x^nf(x)=0$ for all $n\sin\smathbb{N}$). This is a vector space, but not a Hilbert space, because although it has an inner product (inherited from $\smathcal{H}$), it is not complete (there are convergent sequences in $\sPhi$ that converge to nonsmooth and/or non-rapidly-decreasing functions).\n\nNow, $\sPhi'$ is the [[dual space]] to $\sPhi$, defined as the space of all valid linear functionals on $\sPhi$ -- i.e., linear maps $\sphi: \sPhi \sto \scomplex$. Since $\sPhi$ is a subset of $\smathcal{H}$, $\sPhi'$ clearly includes everything in $\smathcal{H}'$, the dual to $\smathcal{H}$. Therefore, $\sPhi'\ssupseteq\smathcal{H}'$, and it's definitely bigger than $\sPhi$. This is possible because $\sPhi$ is not a Hilbert space, so the Riesz theorem doesn't hold.\n\n//Technical point:// You might well ask why we're talking about $\sPhi'$ and $\smathcal{H}'$ here instead of using the notation $\sPhi^*$ and $\smathcal{H}^*$. The answer is embarrassingly technical: $\smathcal{H}'$ is the __continuous__ dual space to $\smathcal{H}$, while $\smathcal{H}^*$ is the __algebraic__ dual space to $\smathcal{H}$, which contains some nasty pathological [[discontinuous linear maps|http://en.wikipedia.org/wiki/Discontinuous_linear_map]]. In finite dimensions, this distinction doesn't exist, but in infinite dimensions we only want the continuous linear functionals.\n\nAnyway, it's also easy to show that $\sPhi'$ is a strict superset of $\smathcal{H}$. 
A nice example is to take $\sket\spsi\sin\smathcal{H}$ such that $\sket\spsi\snot\sin\smathcal{D}(\shat{x})$ -- i.e., a state $\sket\spsi$ such that $\shat{x}$ maps it out of $\smathcal{H}$. Now, $\sbra{\spsi}\shat{x} = (\shat{x}\sket\spsi)^\sdagger$ is not in $\smathcal{H}'$, but it __is__ in $\sPhi'$. Why? Because if $\sket\sphi\sin\sPhi$, then $\sbraopket{\spsi}{\shat{x}}{\sphi}$ is well-defined, because we can operate the $\shat{x}$ to the right and get something that is still in $\sPhi$ and therefore in $\smathcal{H}$.\n\nIn fact, $\sPhi'$ is a lot bigger than $\smathcal{H}$, and it includes the eigenkets of $\shat{x}$ and $\shat{p}$, e.g. $\sket{x'}$ and $\sket{p'}$. This can be shown by brute force -- for any Schwartz function, $\sbraket{x}{\spsi} = \spsi(x)$ is well-defined, and the same for $\sket{p}$. But, recall, this wasn't //why// we built $\sPhi$ -- it sort of came along for free.\n\nSo that's the story of $\sPhi'$, and then $\sPhi^x$ and $\sPhi^*$ are pretty easy. We borrow the anti-isomorphism between bras and kets that takes $\smathcal{H}\srightarrow\smathcal{H}'$, and use it to define a bra $\sbra\sphi$ for every $\sket\sphi\sin\sPhi$. This space of "well-behaved linear functionals" is $\sPhi^*$, which is really an abuse of notation (but... whatever!). Now, we define $\sPhi^x$ as the dual space to $\sPhi^*$... and since $\sPhi^*$ contains bras, $\sPhi^x$ is a space of kets. Furthermore, because $\sPhi$ is anti-isomorphic to $\sPhi^*$, we can conclude that $\sPhi^x$ is anti-isomorphic to $\sPhi'$. It just contains all the kets for the bras in $\sPhi'$.
! Why the website WikiText doesn't compile directly in LaTeX.\n\nUnfortunately, there is no good solution right now for the following\nproblem: "Write math-rich documents in a standard way that (1)\nproduces professional-quality hardcopy, and (2) works effectively on\nthe web." We're getting there, and in a few years I expect it will\nexist, but right now it doesn't. Ideally, I'd be able to write in\nLaTeX and have it appear nicely on the web *without any syntax\ntweaks*. Can't do it.\n\nThe web site uses TiddlyWiki, which (a) is a wiki package; and (b)\ninterprets LaTeX math. Unfortunately, it doesn't\ninterpret all of LaTeX, or even a dense subset. It doesn't actually\ndo //any// of LaTeX except for the math tags, and uses its own wiki\nmarkup for everything else.\n\nGiven the necessity of choosing between (a) writing everything in\nLaTeX and putting up a host of PDF documents, or (b) writing\neverything in TiddlyWiki markup, which can be converted to LaTeX\nwithout too much difficulty, or just printed out... I chose the\nlatter. Why? Basically, because for the purposes of this class, it\ngives me a lot of power. I can write hyperlinked lecture notes and\nput a lot of content up on the website fast, which PDF files are\npoorly suited for.\n\nTherefore, it's unfortunately impossible to directly compile the\nsource code from TiddlyWiki in LaTeX. It would be pretty easy to\nwrite a script -- e.g. in Perl or Python or even YACC -- that\ntranslates TiddlyWiki markup into LaTeX (the reverse is not completely\npossible; LaTeX is much more powerful than TiddlyWiki markup, so there\nare things that simply don't translate). I haven't got the time to do\nit right now, but I'm hoping to get somebody to do it in the near\nfuture (anyone? 
That would probably max out your participation score...)\n\nIn the meantime, if you really want to copy the assignment into a\nLaTeX document, you'll have to do the translation on your own.\nPersonally, I suggest that you just print out the assignment from the\nwebsite, and write only your ''answers'' in LaTeX. I convert TiddlyWiki\nnotes to PDF by printing them, and using the "Print to PDF" option\nthat comes standard on Mac OS X. I believe there's a plugin for\nWindows that enables a similar functionality.\n\n! How to translate\n\nIf you ''do'' want to convert TiddlyWiki markup to LaTeX, I can give you\na quick list of the relevant commands. Here's a starter list; there\nare more, but I'm in a hurry right now.\n\n1. Header lines in TiddlyWiki look like this:\n{{{\n! Section Heading\n!! Subsection Heading\n!!! Subsubsection Heading\n}}}\n\nIn LaTeX, you would write \n{{{\n\ssection{Section Heading}\n\ssubsection{Subsection Heading}\n\ssubsubsection{Subsubsection Heading}\n}}}\n\n2. Italics in TiddlyWiki are: {{{//italicized text//}}}\nIn LaTeX, this would be {{{ \semph{italicized text} }}}\n\n3. Bold face in TiddlyWiki is {{{''bold text''}}}. NOTE: those marks are two adjacent single quotes, NOT a single double quote!\nIn LaTeX, this is {{{ \stextbf{bold text} }}}\n\n4. Bullet lists in TiddlyWiki look like:\n{{{\n* First item\n* Second item\n** Indented list begins\n** Second line of indented list\n* Third item\n}}}\nThe equivalent LaTeX would be\n{{{\n\sbegin{itemize}\n\sitem First item\n\sitem Second item\n\sbegin{itemize}\n\sitem Indented list begins\n\sitem Second line of indented list\n\send{itemize}\n\sitem Third item\n\send{itemize}\n}}}\n\n5. Enumerated lists in TiddlyWiki are exactly the same, but with {{{#}}} instead of {{{*}}}\n\nEnumerated lists in LaTeX are exactly the same, but replace {{{ {itemize} }}} with {{{ {enumerate} }}}.\n\n6. 
LaTeX ignores single newlines and white space -- in other words\n{{{\nSeveral adjacent\nlines of text\nwith lots of spaces\nin\nbetween.\n}}}\n\nwill come out of LaTeX as\n\n{{{\nSeveral adjacent lines of text with lots of spaces in between.\n}}}\n\nTiddlyWiki does not ignore newlines and white space:\n\nSeveral adjacent\nlines of text\nwith lots of spaces\nin\nbetween
Ah. You're confusing (very understandably!) two different potential meanings of "observation":\n1. "An observation (or measurement) is a physical process during which you see one of several possible outcomes."\n2. "An observation is what you observed, i.e. a particular outcome."\n\nThe correct meaning (at least in this course, and in the context of physics and quantum mechanics) is the first. When I put atoms through a Stern-Gerlach, that is an observation. When I look at a thrown die, that is an observation. Each of these has several possible outcomes. When I specify the observation, I *do not* specify which outcome happened, nor do I single out one outcome as being special in any way. Mathematically, we represent an [[observation]] in classical theory by a set of [[indicator function]]s, one corresponding to each possible outcome of the observation. If you prefer (this is almost equivalent, just a bit more restrictive), an observation can be thought of as a set of [[event]]s.\n\nIt is incorrect, in this context, to think of an observation as a single event. Yes, in the end, we observe a single event -- but this is a different meaning of the word "observe" from the one we're focusing on, which is that we observe a property. So the observation corresponds to an entire property (e.g. "Which side of the coin is up?"), and therefore to a *set* of [possible] events, not to a particular event.\n\nBohr famously (well, it's famous to me) remarked that our language is fundamentally ill-suited to quantum theory. We do not, and perhaps cannot, have words that simultaneously make sense to us and describe (precisely) what is going on in the quantum world. Usually we end up with words like "wavefunction" or "measurement" that are borrowed from classical physics and, therefore, have pre-existing meanings to us which *don't* accurately represent what's going on in quantum physics. 
Anyway, as your question and my answer show, the problem extends beyond quantum mechanics -- human languages are ill-suited to describing mathematics and physics precisely. Often -- as in this case -- we are forced to pick a gauge, by choosing exactly one of several meanings that a word has in everyday life. \n\nI hope that clarified exactly what gauge we are picking in this class, and which meaning of "observation" we are using.\n
var n = document.createElement("link"); \nn.rel = "shortcut icon"; \nn.href = "favicon.ico"; \ndocument.getElementsByTagName("head")[0].appendChild(n);
! Mathematical issues in QM\nQuantum state discrimination (unambiguous, minimum-error, maximum-confidence)\nDistinguishability of quantum states\n--Mutually unbiased bases--\n--Quantum t-designs--\n--Symmetric informationally complete POVM--\nTwirling\n--Discrete phase space (+Discrete Wigner function)--\n--Quantum reference frame--\n\n! Just plain quantum mechanics\nLoschmidt echo (related to quantum chaos)\nQuantum chaos\n--Quantum state tomography--\nQuantum process tomography (+ancilla-assisted process tomography)\nQuantum nanomechanical devices (nanomechanical resonator)\n\n! Foundations of QM\nHardy's "Five reasonable axioms"\nThe causaloid formalism\nEnvariance\nContextuality (see Kochen-Specker Theorem)\nLeggett-Garg inequalities\n--The Spekkens toy model--\nWeak measurement\n\n! Decoherence\nQuantum Darwinism (article on wikipedia is weak)\nPointer states (in decoherence)\nQuantum trajectories\nMaster equations in QM (wikipedia has no quantum-specific entry)\nQuantum Brownian motion\n\n! Quantum information and communication\nQuantum communication capacity (of a quantum channel)\nClassical communication capacity of a quantum channel\nQuantum repeater\nNo-broadcasting theorem (+broadcasting of quantum states)\nOptimal quantum cloning\nThe additivity conjectures\n\n! Quantum optics\nHusimi (Q) representation + Glauber (P) representation\nQuantum optical phase operator [Pegg-Barnett; PRA 39 #4 p.1665 (1989); J. Phys. A 19 3849 (1986); JMO v. 36 #1, p.7 (1989; best starting point) ]\n--Optical phase space--\nCavity QED\nSpin-coherent states\nParametric downconversion of photons (stub)\n\n! Quantum computation\nCluster state (stub)\nGottesman-Knill Theorem (needs expansion on wikipedia)\nStabilizer circuit\nQuantum threshold theorem (fault tolerant quantum computation)\nTransversal gate (in fault-tolerant quantum computation)\nNoiseless subsystem\n--Decoherence-free subspace--\nStabilizer code\nCSS code (Calderbank-Shor-Steane)\n
<<<\n1. The Nature of States\n* (a) Look up (in a dictionary, encyclopedia, Teh Interwebs, etc.) the words "epistemic" and "ontological" (see also "ontic"). Based on this research, write 1-5 sentence definitions of the two words, as used in the context "This state of a physical system is an {ontological/epistemic} state.". ''Use your own words.'' You can certainly paraphrase other sources, but if a Google search for the text of your answer turns up a [nearly] exact match, I'll be forced to assume that you plagiarized.\n<<<\n\nEpistemic: Relating to knowledge. An //epistemic state// is one that describes knowledge about a physical system, generally someone's knowledge (which may well be incomplete) about that system's properties and how it will behave.\n\nOntic or Ontological: Relating to reality. An //ontological state// is one describing how a system really is -- what its properties are, independent of whether anybody knows them or not.\n\n<<<\n* (b) In each of the following sentences, state (& explain briefly) whether the word "state" is being used in the sense of "ontological state" or "epistemic state".\n** (i) "The state of this coin is `heads facing up'."\n<<<\n\nThis is both an ontological state (it says what the coin's property //is//) and an epistemic state (it could describe my knowledge, if I knew everything about the coin). Full credit for "ontological" or "both".\n\n<<<\n** (ii) "The state of this classical particle moving along a line is: $\sleft\s{x=2.0\spm0.2\smathrm{\s cm;\s }p=0\spm0.1\smathrm{\s }\sfrac{\smathrm{g}\scdot\smathrm{cm}}{\smathrm{s}}\sright\s}\s,$."\n<<<\n\nThis is an epistemic state. 
Classical particles have well-defined positions and momenta, but this state describes uncertainty and therefore must be someone's knowledge of the particle.\n\n<<<\n** (iii) "The state of the $10^{23}$ nitrogen atoms in this 1 liter box is: they are in thermal equilibrium at exactly 1 atmosphere of pressure and exactly 273 degrees Kelvin."\n<<<\n\nThis is an epistemic state because it describes vast uncertainty about the atoms' individual degrees of freedom. Had I said "..of the gas in this box..." it would have been ambiguous, because pressure, volume, and temperature might be taken as a complete specification of the rather vague system "gas" -- but they do not specify the true, detailed microstate of the $10^{23}$ atoms. Thermodynamic states are //always// epistemic -- this is why it's called "statistical mechanics".\n\n<<<\n** (iv) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ 1 \s\s 0 }~$ in the basis $\s{\sket{\suparrow},\sket{\sdownarrow}\s}$ defined by measuring $J_z$ with a Stern-Gerlach."\n<<<\n\nAmbiguous. There's no consensus on whether quantum pure states are epistemic or ontological! Full credit for anything but a blank answer (the point was to make you think about it...).\n\n<<<\n** (v) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ \sfrac{1}{\ssqrt2}\s, \s\s \sfrac{1}{\ssqrt2}\s, }$ in the same basis as (iv)."\n<<<\n\nAmbiguous for the same reason as above. However, your answer here must agree with the answer to (iv) to get credit -- these are both pure states and there is no difference in their status.\n\n<<<\n** (vi) "The state of this silver atom's angular momentum is $\srho = \sfrac12\sleft(\sproj{\suparrow} + \sproj{\sdownarrow}\sright)~$."\n<<<\n\nThis is an epistemic state. Quantum mixed states are, like classical probabilistic states, convex mixtures of pure states, and represent uncertainty as to which pure state describes the system. 
However, if you //explicitly// argued that the state could be ontological //if// the silver atom was known to be entangled with another system (something we haven't covered yet!), then you get full credit.\n\n
<<<\n2. Probabilistic descriptions of physical systems:\nSuppose I invent a 3-sided die, or "Trie". Its sides are labeled "1", "2", and "3", and we will be interested //only// in the side that is facing up when it is thrown. I manufacture four different versions -- called Alpha, Beta, Gamma, and Delta -- each with a different bias. Extensive testing shows that, when each model is thrown many times in the same way:\n** When Alpha is thrown, each side shows up equally often.\n** When Beta is thrown, "1" appears three times as often as "3", and "2" appears twice as often as "3".\n** When Gamma is thrown, "1" and "2" appear equally often and "3" never shows up.\n** When Delta is thrown, "3" appears every time.\n* (a) For each of the four models, suppose that I throw it so that you cannot see how it lands. Write down (for each model) the probabilistic state describing your knowledge of how it lies.\n<<<\n\nI will write each state as a probability vector; other representations are permissible.\n$$ \svec{P}_\salpha = \smat{\sfrac13 \s\s \sfrac13 \s\s \sfrac13};\s \s \svec{P}_\sbeta = \smat{\sfrac12 \s\s \sfrac13 \s\s \sfrac16};\s \s \svec{P}_\sgamma = \smat{\sfrac12 \s\s \sfrac12 \s\s 0};\s \s \svec{P}_\sdelta = \smat{0 \s\s 0 \s\s 1} $$\n\n<<<\n* (b) Draw and label the probability simplex for the Trie, and plot each of the four states from (a) on it.\n<<<\n\n[img[images/HW2/HW2Sols-Simplex.png]]\n\n<<<\n* (c) For each of the states in (a): //first//, state whether it is pure or mixed; //second//, either write down two different convex decompositions of the state (i.e., write it as a convex combination of 2 other states) or explain why this cannot be done.\n<<<\n$\svec{P}_\salpha$, $\svec{P}_\sbeta$, and $\svec{P}_\sgamma$ are all mixed; $\svec{P}_\sdelta$ is pure.\nThere are many ways to write each of the mixed states as a convex combination of other states. 
A convex decomposition with two terms must be of the form $\svec{P} = p\svec{Q}_1 + (1-p)\svec{Q}_2$, where $\svec{Q}_1$ and $\svec{Q}_2$ are valid probabilistic states. Here are some examples:\n* $\svec{P}_\salpha = \sfrac23\svec{P}_\sgamma + \sfrac13\svec{P}_\sdelta = \sfrac23\svec{P}_\sbeta + \sfrac13\smat{0\s\s \sfrac13 \s\s \sfrac23}$\n* $\svec{P}_\sbeta = \sfrac12\svec{P}_\salpha + \sfrac12\smat{\sfrac23 \s\s \sfrac13 \s\s 0} = \sfrac23\svec{P}_\sgamma + \sfrac13\smat{\sfrac12\s\s0\s\s \sfrac12}$\n* $\svec{P}_\sgamma = \sfrac12\smat{1\s\s 0\s\s 0}+\sfrac12\smat{0\s\s 1\s\s 0} = \sfrac14\smat{1 \s\s 0\s\s 0}+\sfrac34\smat{\sfrac13\s\s \sfrac23 \s\s 0}$\nHowever, $\svec{P}_\sdelta$ is a pure state, which means it cannot be written as a convex combination of any states //other// than itself. Since $\svec{P}_\sdelta$ itself is the only probabilistic state which assigns zero probability to "1" and "2", any convex combination involving any other state would have to assign nonzero probability to "1" or "2", and could not therefore yield $\svec{P}_\sdelta$. //Note: it is sufficient for full credit to observe that $\svec{P}_\sdelta$ is pure; the fuller explanation is not necessary.//\n\n<<<\n* (d) Let the Trie's sample space be indexed by a number $n\sin[1,2,3]$. Write down two different observables for the Trie, at least one of which is complete.\n<<<\n\nAgain, there are very many possibilities. Complete observables include $n$, $n^2$, $e^n$, etc. Incomplete observables include $1$, $n\smathrm{\s mod\s }2$, and $(-1)^n$.\n\n<<<\n* (e) For the complete observable you specified in (d), write down the corresponding observation as a set of indicator functions.\n<<<\n\nAny complete observable assigns different values to each value of $n$, and therefore the corresponding observation is the one that resolves each state perfectly. 
The indicator functions could be written using Kronecker delta notation as $\s{\sdelta_{n,1}, \sdelta_{n,2}, \sdelta_{n,3}\s}$ or as vectors in the dual space to probability space (//Note: -1 point if you write them as probability vectors; this is mostly right but misses the deep point that indicator functions live in the dual space//),\n$$\sleft\s{\smat{1 & 0 & 0}, \smat{0 & 1 & 0}, \smat{0 & 0 & 1}\sright\s},$$\nor in any other way of representing the correct set of $I_j(n)$ indicator functions.\n\n<<<\n* (f) Suppose that your procedure for performing the observation in (e) is flawed (perhaps you need an eye exam?). Half the time, it works, but half the time you get a uniformly random result (that bears no connection with the actual configuration of the Trie!), and you don't know when it fails. Write down this observation as a set of indicator functions.\n<<<\n\nI will write the indicator functions as vectors in the probability dual space:\n$$\sleft\s{\smat{\sfrac23 & \sfrac16 & \sfrac16}, \smat{\sfrac16 & \sfrac23 & \sfrac16}, \smat{\sfrac16 & \sfrac16 & \sfrac23}\sright\s}.$$\nThat these are the correct indicator functions follows from the observation that //if// the true state is $n$, then outcome "$n$" occurs with probability $\sfrac23$, and each of the others occurs with probability $\sfrac16$. Thus, the measurement "works" $\sfrac12$ of the time, but we get the right answer by random accident $\sfrac16 = \sfrac13\scdot\sfrac12$ of the time too. //Note: $\sfrac12$ credit if you wrote $\smat{\sfrac12 & \sfrac14 & \sfrac14}$ and permutations, thus getting the right idea but the wrong math.//\n
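As a quick sanity check on part (f), the noisy observation can be built as an equal mixture of the perfect observation and a uniformly random one. The following is only a minimal sketch: the function name noisy_indicators and the parameter p_work are illustrative assumptions, not part of the problem set.

```python
from fractions import Fraction

def noisy_indicators(num_states, p_work):
    """Indicator functions for an observation that works with probability
    p_work and otherwise reports a uniformly random outcome.
    Row j lists P(outcome j | true state n) for n = 1..num_states."""
    uniform = Fraction(1, num_states)
    rows = []
    for j in range(num_states):
        row = []
        for n in range(num_states):
            ideal = 1 if n == j else 0  # perfect observation: delta_{n,j}
            row.append(p_work * ideal + (1 - p_work) * uniform)
        rows.append(row)
    return rows

# The Trie with a procedure that works half the time.
indicators = noisy_indicators(3, Fraction(1, 2))
for row in indicators:
    print([str(x) for x in row])
```

Each column of the result sums to 1, as it must: whatever the true state is, some outcome always occurs.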
<<<\n3. Bayes' Rule, decisions, and convex combination\n* (a) For each of the models of Trie described in Problem 2, suppose that the corresponding Trie is thrown, and then you make the observation described in 2(f). What is -- in each of the four cases -- the probability distribution of the outcomes?\n<<<\nI will label the outcome by "j", and write the probability distribution of the outcomes in each case as a probability vector:\n* Alpha: $P_{obs}(j) = \svec{I}_j\scdot\svec{P}_\salpha$, so $\svec{P}_{obs} = \smat{\sfrac13 \s\s \sfrac13 \s\s \sfrac13}$.\n* Beta: $\svec{P}_{obs} = \smat{\sfrac{5}{12} \s\s \sfrac13 \s\s \sfrac14}$.\n* Gamma: $\svec{P}_{obs} = \smat{\sfrac{5}{12} \s\s \sfrac{5}{12} \s\s \sfrac16}$.\n* Delta: $\svec{P}_{obs} = \smat{\sfrac16 \s\s \sfrac16 \s\s \sfrac23}$.\n\n<<<\n* (b) In each of the cases of (a), suppose you get the result corresponding to "1". Write down (for each model) the probabilistic state describing your knowledge of how the Trie lies //after// the observation.\n<<<\n\nIn each case we apply Bayes' Rule, which says that after observing outcome $k$ described by indicator function $I_k(n)$, we update to $P'(n) \spropto P_0(n)I_k(n)$, and normalize.\n* Alpha: $\svec{P}' = \smat{\sfrac23 \s\s \sfrac16 \s\s \sfrac16}$\n* Beta: $\svec{P}' = \smat{\sfrac45 \s\s \sfrac{2}{15} \s\s \sfrac{1}{15}}$\n* Gamma: $\svec{P}' = \smat{\sfrac45 \s\s \sfrac15 \s\s 0}$\n* Delta: $\svec{P}' = \smat{0 \s\s 0 \s\s 1}$\n\n<<<\n* (c) For each of the cases in (a), you are asked to make your best guess (maximizing probability of guessing correctly) as to how the Trie lies. What is the probability that you guess correctly, both: (i) //before// the observation, and (ii) //after// the observation? For which models did the observation help you? 
(//Note: ignore part (b) here; do __not__ assume that you get result "1"//).\n<<<\n\n(i) Before the observation, we have nothing but our //prior// knowledge of the probabilities of the various values of $n$ to go on. The best guess is the most probable value, and the probability that we guess correctly is simply its probability. Thus:\n* Alpha: $P_{correct} = \sfrac13$\n* Beta: $P_{correct} = \sfrac12$\n* Gamma: $P_{correct} = \sfrac12$\n* Delta: $P_{correct} = 1$\n\n(ii) We already determined the probabilities $P_{obs}(j)$ of the various outcomes $j$ in part (a). For each of the three possible outcomes $j$, we then apply Bayes' Rule to update our probability distribution to $P(n|j)$. The best guess is the //mode// of the resulting state -- i.e., the $n'$ such that $P(n'|j)>P(n|j)$ for all other $n$. The probability that this guess is correct is just $P(n'|j) = \smax_n{P(n|j)}$. Thus, the total probability of guessing correctly is the average of this quantity over all the possible outcomes $j$, or\n$$P_{correct} = \ssum_j{P_{obs}(j)\smax_n{P(n|j)}}$$.\n* Alpha: This one is easy; each outcome $j$ is equally probable, and no matter which one we get, the best guess is $n=j$, which is correct with probability $\sfrac23$. Thus $P_{correct} = \sfrac23$.\n* Beta: Working through Bayes' Rule, we find again that the best guess is $n=j$, and this guess is correct with probabilities $P(1|1)=\sfrac45$, $P(2|2)=\sfrac23$, and $P(3|3)=\sfrac49$. Thus $P_{correct}=\sfrac{5}{12}\sfrac45 + \sfrac13\sfrac23+\sfrac14\sfrac49 = \sfrac23$.\n* Gamma: In this case, if we observe "1" or "2" then the best guess is $n=j$ and this guess is correct with probability $P(1|1)=P(2|2)=\sfrac45$. If we observe "3" then we //know// the measurement has failed, and we fall back on guessing either "1" or "2", getting it right with probability $\sfrac12$. 
Thus $P_{correct} = 2\sfrac{5}{12}\sfrac45 + \sfrac16\sfrac12 = \sfrac34$.\n* Delta: The measurement is pointless; we already know $n=3$, so $P_{correct}=1$.\n\n* The observation helps in all cases except the pure state (Delta), where we already have complete knowledge.\n\n<<<\n* (d) I fill an urn with 100 Tries -- 50 "Beta" models and 50 "Gamma" models -- then pull one out at random and throw it so that you cannot see either which //kind// of Trie it is, or how it falls. Write down a probabilistic state describing your knowledge of how it lies.\n<<<\n\n\nI picked a Beta or a Gamma Trie with probability $\sfrac12$ each, so the state describing the unknown Trie is a convex combination:\n$$\svec{P} = \sfrac12\svec{P}_\sbeta + \sfrac12\svec{P}_\sgamma = \smat{\sfrac12 \s\s \sfrac{5}{12} \s\s \sfrac{1}{12}}$$\n<<<\n* (e) You are asked to make an observation on the Trie thrown in (d), and then to decide whether I selected a Beta or a Gamma model. Write down an observation (as a set of indicator functions) //with only two outcomes// that achieves the highest achievable probability of guessing correctly, and state the probability of guessing correctly both before and after the observation.\n<<<\n\nThere are two possibilities: either I picked a Beta or a Gamma. You can describe your initial knowledge of this by a probability distribution over the sample space $G = \s{\sbeta, \sgamma\s}$, and this distribution is $\svec{p_0} = \smat{\sfrac12 \s\s \sfrac12}$. 
The goal of an observation is to help you decide which is //more// probable, //given// the observation -- which is a job for Bayes' Rule.\n\nAn observation of $n$ tells you something about whether I picked Beta or Gamma, because each outcome corresponds to an indicator function on the sample space G:\n$$\svec{I}_n = \smat{ P_\sbeta(n) & P_\sgamma(n) }$$.\nThis is a little tricky, and deserves some careful thought -- the elements of the indicator function do //not// form a probability distribution, but they do represent the relative probabilities of observing $n$ if the thrown Trie was a Beta or Gamma model (respectively).\n\nThe full observation of $n$ thus corresponds to indicator functions on G:\n$$\sleft\s{ \smat{ \sfrac12 \s\s \sfrac12 }, \smat{ \sfrac13 \s\s \sfrac12 }, \smat{ \sfrac16 \s\s 0 } \sright\s}.$$\nSince your prior knowledge is unbiased -- i.e., $\svec{p_0} = \smat{\sfrac12 \s\s \sfrac12}$ -- your state $\svec{p}'$ //after// an observation of $n$ is determined completely by the corresponding indicator function, and is in fact proportional to it. Thus, for each value of $n$ observed, you should guess "Beta" if $P_\sbeta(n) > P_\sgamma(n)$, and "Gamma" if $P_\sgamma(n) > P_\sbeta(n)$... and if the two are equal, then you can guess either way.\n\nIn other words, you do not need to observe $n$; you merely need to observe the sign function $\smathrm{sgn}(P_\sbeta(n)-P_\sgamma(n))$, which is +1 if its argument is non-negative and -1 if it is negative (//Note: the sign function can also be defined to take values $\s{\spm1,0\s}$, but this definition is more convenient here.//) This is not a complete observable -- it has two indicator functions, one corresponding to all the values of $n$ where $P_\sbeta(n) \sgeq P_\sgamma(n)$, and the other corresponding to the values where $P_\sbeta(n) < P_\sgamma(n)$. 
In this particular case, this observation is\n$$ \s{ \svec{I}_\sbeta = \smat{1 & 0 & 1}, \svec{I}_\sgamma = \smat{0 & 1 & 0} \s},$$\nwhich means that you are going to guess "Beta" if you get $n=1$ or $n=3$, and "Gamma" if you get $n=2$. However, since $P_\sgamma(1) = P_\sbeta(1)$, we can guess either way if we get $n=1$, or even guess randomly, so any observation \n$$ \s{ \svec{I}_\sbeta = \smat{p & 0 & 1}, \svec{I}_\sgamma = \smat{1-p & 1 & 0} \s},$$\nis a valid solution!\n\nBefore the observation, either guess is correct with probability $\sfrac12$. After the observation, the probability of guessing correctly is\n$$ P_{correct} = P(\sbeta)P(I_\sbeta|\sbeta) + P(\sgamma)P(I_\sgamma|\sgamma) = \sfrac12\sleft(P_\sbeta(1)+P_\sbeta(3)\sright) + \sfrac12P_\sgamma(2) = \sfrac{7}{12}.$$\n\n<<<\n* (f) Generalize part (e) to an arbitrary $n$-state system, where I may have chosen to prepare either state $\svec{p} = \s{p_1\sldots p_n\s}$ or $\svec{q} = \s{q_1\sldots q_n\s}$, with equal //prior probabilities// for $\svec{p}$ and $\svec{q}$. You can perform any observation on a single sample of the unknown state; show that the highest probability of guessing correctly is given by $P_{correct} = \sfrac12 + \sfrac14\ssum_n{|p(n)-q(n)|}$.\n<<<\n\nMost of the reasoning is in the solution to (e). We could simply measure $n$, and then we would guess that $\svec{p}$ was prepared if we got a value of $n$ for which $p(n) > q(n)$, and that $\svec{q}$ was prepared if we got a value of $n$ for which $q(n) > p(n)$. If we get one for which $q(n)=p(n)$ then it doesn't matter which we guess; we have a 50% chance either way.\n\nThis can be represented, again, as a 2-outcome measurement of the observable $\smathrm{sgn}(p(n)-q(n))$. 
The probability of success is given by\n$$ P_{correct} = P(\svec{p})P(I_{\svec{p}}|\svec{p}) + P(\svec{q})P(I_{\svec{q}}|\svec{q}) = \sfrac12\sleft(P(I_{\svec{p}}|\svec{p}) + P(I_{\svec{q}}|\svec{q})\sright),$$\nand a moment's thought shows that this can be written as\n$$ P_{correct} = \sfrac12\ssum_n{\smathrm{max}[p(n),q(n)]}.$$\nThis is a fairly nice looking result, but we can improve it by using the following rather cute identity:\n$$ \smathrm{max}[p(n),q(n)] = \sfrac{p(n)+q(n)}{2} + \sfrac{|p(n)-q(n)|}{2}, $$\nwhich follows from the fact that $p(n)$ and $q(n)$ are spaced equally far from their mean. So we plug that in and recall that $\ssum_n{p(n)} = \ssum_n{q(n)} = 1$ and obtain\n$$P_{correct} = \sfrac12 + \sfrac14\ssum_n{|p(n)-q(n)|}$$\nThe quantity $\ssum_n{|p(n)-q(n)|}$ is called the ''1-norm'' of the vector $\svec{p}-\svec{q}$. It not only has a nice interpretation here as the //distinguishability// of two probabilistic states, but has a nice generalization to the comparable quantum problem!\n
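The result of 3(f) is easy to check against the Beta-versus-Gamma numbers from 3(e). Here is a minimal sketch using exact fractions. The function names are illustrative choices, not anything defined in the problem set.

```python
from fractions import Fraction as F

def best_guess_probability(p, q):
    """Optimal probability of telling p from q (equal priors) from one sample:
    guess whichever state gives the observed n the larger probability."""
    return sum(max(pn, qn) for pn, qn in zip(p, q)) / 2

def one_norm_formula(p, q):
    """The same quantity written as 1/2 + (1/4) * (1-norm of p - q)."""
    return F(1, 2) + sum(abs(pn - qn) for pn, qn in zip(p, q)) / 4

beta = [F(1, 2), F(1, 3), F(1, 6)]
gamma = [F(1, 2), F(1, 2), F(0)]

print(best_guess_probability(beta, gamma))  # 7/12
print(one_norm_formula(beta, gamma))        # 7/12
```

Both expressions agree with the value found in 3(e), and identical states give exactly 1/2: no observation helps when there is nothing to distinguish.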
<<<\n4. Diagonalizing observables and unitary matrices.\nSuppose $A$ is a Hermitian matrix (i.e., a representation of an operator $\smathbf{A}$ on a finite-dimensional Hilbert space). Let us call the basis in which $A$ is written $\s{\sket{1},\sket{2},\sldots\sket{n}\s}$, so $A_{ij} \sequiv \sbraopket{i}{\smathbf{A}}{j}$.\n* (a) The Spectral Theorem implies that $A$ has a complete set of eigenkets $\s{\sket{a_i}\s}$ with eigenvalues $\s{a_i\s}$. Write down a matrix $U$ such that $UAU^\sdagger$ is diagonal.\n<<<\n\nIf we choose $U^\sdagger$ to be the matrix whose columns are the eigenvectors of $A$, $U$ will diagonalize $A$. Recall that kets are column vectors, so we write\n$$U^\sdagger = \ssum_k{\sketbra{a_k}{k}} \sLongrightarrow U = \ssum_k{\sketbra{k}{a_k}}$$\nTo prove that this works, note that\n$$UAU^\sdagger = \ssum_{k,l}{\sketbra{k}{a_k}A\sketbra{a_l}{l}} = \ssum_{k,l}{a_l\sket{k}\sbraket{a_k}{a_l}\sbra{l}} = \ssum_{k,l}{a_l\sdelta_{k,l}\sketbra{k}{l}} = \ssum_{k}{a_k\sproj{k}},$$\nso there are no off-diagonal elements.\n\n<<<\n* (b) Prove that $U$ is //unitary//.\n<<<\n\nA matrix $U$ is unitary if $U^\sdagger U = \sId$. In this case, since $U = \ssum_k{\sketbra{k}{a_k}}$,\n$$U^\sdagger U = \ssum_{k,l}\sketbra{a_k}{k}\sketbra{l}{a_l} = \ssum_{k,l}\sdelta_{k,l}\sketbra{a_k}{a_l} = \ssum_{k}\sketbra{a_k}{a_k} = \sId,$$\nwhere in the last step we've used the fact that the sum of the projectors onto any basis is equal to the identity operator.\n\n<<<\n* (d) Write $UAU^\sdagger$ in Dirac notation. Your answer should make it clear what the diagonal elements are.\n<<<\n\nFrom part (a), $UAU^\sdagger = \ssum_{k}a_k\sproj{k}$. The diagonal elements are the eigenvalues $a_k$.\n\n<<<\n* (c) The //commutator// of two matrices $A$ and $B$ is defined as $[A,B] = AB-BA$. Prove that $U$ commutes with its adjoint -- i.e., $[U,U^\sdagger]=0$.\n<<<\n\n$\sleft[U,U^\sdagger\sright] = UU^\sdagger - U^\sdagger U$. 
We already proved that $U^\sdagger U = \sId$, and now we observe that\n$$UU^\sdagger = \ssum_{k,l}\sketbra{k}{a_k}\sketbra{a_l}{l} = \ssum_{k,l}\sdelta_{k,l}\sketbra{k}{l} = \ssum_{k}\sketbra{k}{k} = \sId,$$\nso $\sleft[U,U^\sdagger\sright] = UU^\sdagger - U^\sdagger U = \sId - \sId = 0$.\n\n<<<\n* (d) A matrix that commutes with its adjoint is called //normal//, and the Spectral Theorem can be proved for all normal matrices. Use this to prove that the eigenvalues of $U$ are $\s{e^{i\stheta_k}\s}$, where the $\stheta_k$ are real numbers.\n<<<\n\n//Proof:// By the Spectral Theorem, $U$ has eigenvalues $u_k$ and corresponding eigenvectors $\s{\sket{u_k}\s}$. We can therefore construct a unitary $V$ so that $D = VUV^\sdagger$ is diagonal with diagonal entries $\s{u_k\s}$. Therefore, $U = V^\sdagger D V$ and $U^\sdagger = V^\sdagger D^\sdagger V$. Using these identities, we can write\n$$\sId = U U^\sdagger = V^\sdagger D V V^\sdagger D^\sdagger V = V^\sdagger D D^\sdagger V$$,\nand by multiplying by $V$ and $V^\sdagger$ on the left and right (resp) of both sides, we get \n$$D D^\sdagger = \sId.$$\nSince $D$ and $D^\sdagger$ are diagonal, the diagonal entries of $D D^\sdagger$ are $u_k^*u_k$, and since the previous equation implies they are also equal to 1, we have $u_k^* u_k=1$, which means that $u_k = e^{i\stheta_k}$ for some real numbers $\stheta_k$. QED.\n\n<<<\n* (e) The //infinity norm// of an operator $X$, denoted $||X||_\sinfty$, is the absolute value of $X$'s largest eigenvalue. It is a good measure of the overall magnitude of an operator (e.g., if $||X||_\sinfty=0$, then $X=0$). Prove the following theorem: For any real number $\sepsilon>0$, there is an integer $n>0$ such that $||U^n-\sId||_\sinfty < \sepsilon$.\n<<<\n\n[>img[images/HW2/HW2Sols-phases.png]]\nThis is a hard problem! -- mathematically quite a bit more sophisticated than anything else in this set. 
Because the time evolution of quantum systems is represented by a family of unitary operators like this, we're basically proving a quantum equivalent of the [[Poincare recurrence theorem|http://en.wikipedia.org/wiki/Poincare_recurrence]] -- i.e., that if a quantum system is allowed to run for long enough, then it will return arbitrarily close to its starting state.\n\nLet us work in the eigenbasis of $U$, since it has one, and let $d$ be the dimension of $U$. Both $U$ and $\sId$ are thus diagonal ($\sId$ is diagonal in every basis), and $U$ has eigenvalues $e^{i\stheta_k}$ for $k=1\sldots d$. Furthermore, $U^n$ has eigenvalues $e^{in\stheta_k}$. We are going to regard these eigenvalues, $\s{e^{in\stheta_k}\s}$, as dynamical variables that evolve as we increase $n$. We are trying to prove that there is some $n$ such that $||U^n-\sId||_\sinfty < \sepsilon$, so note that\n$$||U^n-\sId||_\sinfty = \smathrm{max}_k\sleft(|e^{in\stheta_k}-1|\sright) = \smathrm{max}_k\sleft(2\sleft|\ssin\sleft(\sfrac{n\stheta_k}{2}\sright)\sright|\sright).$$\nSince $|\ssin(x)|\sleq|x|$ for all real $x$, each term is bounded by the distance from $n\stheta_k$ to the nearest integer multiple of $2\spi$. It therefore suffices to show that, for all $k$, $n\stheta_k\smathrm{\s mod\s }2\spi < 2\sepsilon$ (with each phase measured from the nearest multiple of $2\spi$): this gives $||U^n-\sId||_\sinfty < 2\sepsilon$, and because $\sepsilon$ is arbitrary the factor of 2 is harmless.\n\nIf we relax the condition that $n$ be an integer, and let $n$ be a positive real number, then the phases $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ of $U^n$ for any $n$ represent a point on a $d$-torus. Furthermore, as $n$ increases from zero, they trace out a trajectory on the torus (see Figure 1), which wraps around whenever one of the $n\stheta_k$ passes an integer multiple of $2\spi$. This is the trajectory of the dynamical map $\s{\sphi_k\s} \srightarrow \s{\sphi_k + n\stheta_k\s}$ (it should be obvious that if we start with $\s{\sphi_k=0\s}$, then applying this map for integer $n$ gives the phases of $U^n$). 
Note also that this dynamical map is //linear// in the $\stheta_k$, so it preserves the shape of regions on the torus (see Figure 2).\n\n[>img[images/HW2/HW2Sols-phases2.png]]\nWe wish to show that for some $n$ (and all $k$), $\sleft(n\stheta_k\smathrm{\s mod\s }2\spi\sright) < 2\sepsilon$. This is equivalent to the point $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ lying within a hypercube of sidelength $2\sepsilon$ centered at the origin (red square in Fig. 1) -- which, in turn, is exactly equivalent to the existence of //overlap// between a hypercube of sidelength $\sepsilon$ centered at $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ and another hypercube of sidelength $\sepsilon$ centered at the origin (green squares in Fig. 2).\n\nThe key to the proof is the fact that the entire torus has volume $(2\spi)^d$, and each hypercube of sidelength $\sepsilon$ centered at $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ has volume $\sepsilon^d$. As we increase $n$, we litter the torus with more and more of these regions (see Fig. 2), and eventually we run out of room! If $n>\sleft(\sfrac{2\spi}{\sepsilon}\sright)^d$, then there must exist some $n'\sleq n$ and some $m\sleq n$ such that the hypercubes around $\s{n'\stheta_k\smathrm{\s mod\s }2\spi\s}$ and $\s{m\stheta_k\smathrm{\s mod\s }2\spi\s}$ overlap.\n\nSince the dynamical map is linear (//note: the weaker condition of "area-preserving" is sufficient//), this implies that the hypercubes for $n'-1$ and $m-1$ also overlap, and by induction the hypercube for $N = n'-m$ overlaps the one at the origin -- which means that $N\stheta_k\smathrm{\s mod\s }2\spi < 2\sepsilon$. QED.
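The recurrence theorem can also be watched numerically. The sketch below simply searches for the first integer power at which every eigenvalue phase has returned close to zero. The eigenphases in thetas are hypothetical values chosen for illustration, and recurrence_time is an illustrative name. The brute-force search is not the pigeonhole argument itself, only a check of its conclusion.

```python
import cmath
import math

def recurrence_time(thetas, eps, max_n=1_000_000):
    """Smallest integer n >= 1 with max_k |exp(i*n*theta_k) - 1| < eps,
    found by brute force; returns None if there is none up to max_n."""
    for n in range(1, max_n + 1):
        if max(abs(cmath.exp(1j * n * t) - 1) for t in thetas) < eps:
            return n
    return None

# Hypothetical eigenphases of a 2-dimensional unitary U.
thetas = [1.0, math.sqrt(2.0)]
eps = 0.1
n = recurrence_time(thetas, eps)
print(n, max(abs(cmath.exp(1j * n * t) - 1) for t in thetas))
```

For d = 2 and eps = 0.1, the counting argument promises a recurrence within roughly (2 pi / eps)^2, about 4000 steps, so the search terminates quickly.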
<<<\n1. The Quantum Zeno Effect\nConsider an idealized Stern-Gerlach apparatus, and let the silver atoms' angular momentum state be described by a state in a 2-dimensional Hilbert space with basis kets $\sket\suparrow$ and $\sket\sdownarrow$ representing "spin up" and "spin down", respectively, along the $\shat{z}$ axis.\n* (a) Suppose we want to prepare a beam of silver atoms in the $\sket{\suparrow}$ state. How (briefly) could this be done?\n<<<\n\nPass a beam of randomly prepared atoms (e.g. from an oven) through a Stern-Gerlach apparatus oriented along the $\shat{z}$ axis, which will split it into two beams. Take the top beam and discard the bottom one.\n\n<<<\n* (b) The beam from (a) is fed into a Stern-Gerlach apparatus measuring $J_x$. We observe that there are two possible outcomes to the measurement, corresponding to $J_x = \spm\sfrac{\shbar}{2}$. Write down (in terms of the base kets above): (i) a set of bras corresponding to the measurement of $J_x$; (ii) the measurement itself as a set of projectors; (iii) the operator representing the observable $\smathbf{J_x}$; (iv) the probabilities of the outcomes $\spm\sfrac\shbar2$ when silver atoms prepared as in (a) are fed in.\n<<<\n\n(i) From various sources including class lectures, we know that atoms in the $\sket\suparrow$ or $\sket\sdownarrow$ states will be found to have $J_x = \spm\sfrac{\shbar}{2}$ with equal probability of each sign. Therefore, the bras $\s{\sbra{\sleftarrow},\sbra\srightarrow\s}$ corresponding to this measurement must be //unbiased// with respect to the $\s{\sket\suparrow,\sket\sdownarrow\s}$ states -- that is, must satisfy $|\sbraket{\sleftarrow}{\suparrow}|^2 = |\sbraket{\srightarrow}{\suparrow}|^2 = |\sbraket{\sleftarrow}{\sdownarrow}|^2 = |\sbraket{\srightarrow}{\sdownarrow}|^2 = \sfrac12$. In addition, to represent the outcomes of a repeatable measurement they must be orthogonal -- i.e., $\sbraket{\sleftarrow}{\srightarrow}=0$. 
These constraints mean that\n$$\sbra\sleftarrow = \sfrac{1}{\ssqrt2}\sbra\suparrow + \sfrac{e^{i\sphi}}{\ssqrt2}\sbra\sdownarrow\smathrm{;\s \s }\sbra\srightarrow = \sfrac{1}{\ssqrt2}\sbra\suparrow - \sfrac{e^{i\sphi}}{\ssqrt2}\sbra\sdownarrow.$$\nBy convention, we choose $\sphi=0$ (but other choices of $\sphi$ are not, in principle, incorrect) to get\n$$\sbra\sleftarrow = \sfrac{1}{\ssqrt2}\sleft(\sbra\suparrow + \sbra\sdownarrow\sright)\smathrm{;\s \s }\sbra\srightarrow = \sfrac{1}{\ssqrt2}\sleft(\sbra\suparrow - \sbra\sdownarrow\sright).$$\n(//Note: all the explanation & derivation is not required for full credit//.)\n\n(ii) Using the notation above, the measurement is\n$$\sleft\s{ \sproj{\sleftarrow}, \sproj{\srightarrow} \sright\s}$$\nAlternatively, it could be represented explicitly in the $J_z$ basis as\n$$\sleft\s{ \sfrac12\smat{ 1 & 1 \s\s 1 & 1}, \sfrac12\smat{ 1 & -1 \s\s -1 & 1} \sright\s}.$$\n\n(iii) The observable $\smathbf{J_x}$ assigns a value (in this case, $\spm\sfrac{\shbar}{2}$) to each outcome of the measurement in (ii). By convention, the outcome represented by $\sbra\sleftarrow = \sfrac{1}{\ssqrt2}\sleft(\sbra\suparrow+\sbra\sdownarrow\sright)~$ is associated with $J_x=+\sfrac{\shbar}{2}$. Thus\n$$ J_x = \sfrac{\shbar}{2}\sproj{\sleftarrow} + \sleft(-\sfrac{\shbar}{2}\sright)\sproj{\srightarrow} = \sfrac{\shbar}{2}\sleft(\sproj\sleftarrow-\sproj\srightarrow\sright) = \sfrac{\shbar}{2}\smat{ 0 & 1 \s\s 1 & 0}$$\n\n(iv) From experiments (and lectures about them) we know that these probabilities are $p_{\spm} = \sfrac12$. 
However, we can check this using Born's Rule:\n\sbegin{eqnarray}\np_+ = |\sbraket{\sleftarrow}{\suparrow}|^2 &=& \sTr\sleft[\sproj\sleftarrow\sproj\suparrow\sright] = \sfrac12 \s\s\np_- = |\sbraket{\srightarrow}{\suparrow}|^2 &=& \sTr\sleft[\sproj\srightarrow\sproj\suparrow\sright] = \sfrac12 \s\s\n\send{eqnarray}\n\n<<<\n* (c) In a new experiment, the atoms from (a) are fed into a Stern-Gerlach apparatus oriented at angle $\sfrac{\spi}{4}$ between the $\shat{x}$ and $\shat{z}$ axes, which measures an observable we will call $\smathbf{J_{\spi/4}}$. (i) What are the probabilities for the various outcomes? (ii) write down the measurement as a set of projectors.\n<<<\n\n(i) From experiments (and the lectures), we know that when particles that have been measured along one axis (e.g. $\shat{z}$) are then measured along another axis that makes an angle $\stheta$ with the first, the probabilities for the second measurement are $p_+ = \scos^2\sfrac{\stheta}{2}$, $p_- = \ssin^2\sfrac{\stheta}{2}$. Since in this case $\stheta=\sfrac{\spi}{4}$,\n$$p_+ = \scos^2\sfrac{\spi}{8} = \sfrac{2+\ssqrt2}{4}\smathrm{;\s \s }p_- = \ssin^2\sfrac{\spi}{8} = \sfrac{2-\ssqrt2}{4}$$\n\n(ii) The probabilities from (i) mean that the bras for this measurement, $\sbra{\snwarrow}$ and $\sbra{\ssearrow}$, must satisfy $|\sbraket{\snwarrow}{\suparrow}|^2 = \scos^2\sfrac{\spi}{8}$ and $|\sbraket{\ssearrow}{\suparrow}|^2 = \ssin^2\sfrac{\spi}{8}$. Also, $\sbraket{\snwarrow}{\ssearrow}=0$. Therefore\n\sbegin{eqnarray}\n\sbra{\snwarrow}~ &=& \scos\sfrac{\spi}{8}\sbra{\suparrow} + e^{i\sphi}\ssin\sfrac{\spi}{8}\sbra{\sdownarrow} \s\s\n\sbra{\ssearrow}~ &=& \ssin\sfrac{\spi}{8}\sbra{\suparrow} - e^{i\sphi}\scos\sfrac{\spi}{8}\sbra{\sdownarrow}\n\send{eqnarray}\nTo figure out what the phase $\sphi$ must be, we recall that this measurement setting not only makes an angle of $\sfrac{\spi}{4}$ with the $\shat{z}$ axis but also with the $\shat{x}$ axis. 
Thus, the equations above should also hold if we replace $\sket{\suparrow}$ with $\sket{\sleftarrow}$. This is only true if $\sphi=0$, so\n\sbegin{eqnarray}\n\sbra{\snwarrow}~ &=& \scos\sfrac{\spi}{8}\sbra{\suparrow} + \ssin\sfrac{\spi}{8}\sbra{\sdownarrow} \s\s\n\sbra{\ssearrow}~ &=& \ssin\sfrac{\spi}{8}\sbra{\suparrow} - \scos\sfrac{\spi}{8}\sbra{\sdownarrow}\n\send{eqnarray}\nThe measurement projectors are thus\n$$ \sleft\s{ \sproj{\snwarrow},\sproj{\ssearrow}\sright\s} = \sleft\s{ \smat{ \scos^2\sfrac{\spi}{8} & \ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} \s\s \ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} & \ssin^2\sfrac{\spi}{8} }, \smat{ \ssin^2\sfrac{\spi}{8} & -\ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} \s\s -\ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} & \scos^2\sfrac{\spi}{8} } \sright\s}$$\n\n<<<\n* (d) After the measurement in (c), all the atoms are recombined into a single beam. Write down its state.\n<<<\n\nAfter a repeatable measurement yielding outcome $j$ represented by projector $\sproj{j}$, the measured system will be found in the state $\sket{j}$ (it must, otherwise the measurement would not be repeatable). Thus, after the measurement in (c), a fraction $p_+$ of the atoms are in the state $\sket{\snwarrow}$ and the remaining $p_-$ are in the state $\sket{\ssearrow}$. Thus, in the recombined beam, //some// of the atoms are in one pure state, and //some// are in another. 
We must describe the whole beam by a mixed state -- for which purpose we must represent the pure beams by projectors rather than by kets (combining kets by superposition yields a new ket, but as all kets are pure states, there is no ket representing the mixed state of the beam!)\n\sbegin{eqnarray}\n \srho &=& p_+\sproj{\snwarrow} + p_-\sproj{\ssearrow} \s\s\n &=& \smat{ \scos^4\sfrac{\spi}{8} + \ssin^4\sfrac{\spi}{8} & \sleft(\scos^2\sfrac{\spi}{8}-\ssin^2\sfrac{\spi}{8}\sright)\ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} \s\s \sleft(\scos^2\sfrac{\spi}{8}-\ssin^2\sfrac{\spi}{8}\sright)\ssin\sfrac{\spi}{8}\scos\sfrac{\spi}{8} & 2\scos^2\sfrac{\spi}{8}\ssin^2\sfrac{\spi}{8} } \s\s\n &=& \smat{ \sfrac34 & \sfrac14 \s\s \sfrac14 & \sfrac14 }\n\send{eqnarray}\n\n<<<\n* (e) The beam from (d) is fed into a Stern-Gerlach apparatus measuring $J_x$ [a la part (b)]. What are the probabilities of the $\spm\sfrac\shbar2$ outcomes?\n<<<\n\nThis is a straightforward application of Born's Rule:\n$$p_+ = \sTr\sleft[ \srho\sproj{\sleftarrow}\sright] = \sfrac34\smathrm{;\s \s }p_- = \sTr\sleft[ \srho\sproj{\srightarrow}\sright] = \sfrac14$$\nThe remarkable thing here is that if we had //not// measured $J_{\spi/4}$ in between, these probabilities would have been $\sfrac12$ each. By making a measurement in between $J_z$ and $J_x$, we sort of cushioned the randomizing effect of the change in measurement basis. This is the Quantum Zeno effect -- essentially, a watched system is likely to stay in the state in which it started, even if the measurement basis changes slowly.\n\n<<<\n* (f) In yet another experiment, we pass the beam from (a) through //three// consecutive Stern-Gerlach apparatuses, oriented at $\sfrac\spi6$, $\sfrac\spi3$, and $\sfrac\spi2$ to the $\shat{z}$ axis (so the last is, again, measuring $\smathbf{J_x}$). After each measurement, the beams are recombined. 
What are the probabilities for the final $\smathbf{J_x}$ measurement?\n<<<\n\nWe could work this out in full, gory, matrix detail -- but that wouldn't be any fun. So we'll be clever.\nEach measurement is represented by an angle, so the $n$th measurement is at $\stheta_n$. $\stheta_0 = 0$, $\stheta_1 = \sfrac{\spi}{6}$, etc. The $n$th measurement has two outcomes which we'll call $+_{\stheta_n}$ and $-_{\stheta_n}$, with corresponding projectors $\sproj{+_{\stheta_n}}$ and $\sproj{-_{\stheta_n}}$. If the probabilities of the outcomes are $p_n(+)$ and $p_n(-)$, then after we recombine the beams, the beam's state is\n$$\srho_n = p_n(+)\sproj{+_{\stheta_n}} + p_n(-)\sproj{-_{\stheta_n}}$$\nNow, at the beginning of the experiment (i.e., after the 0th measurement, which is in the $\shat{z}$ direction, to prepare the beam), we have $p_0(+)=1$ and $p_0(-)=0$, so\n$$\srho_0 = \sproj{+_0} = \sproj{\suparrow}.$$\nThe next measurement is at $\stheta_1 = \sfrac{\spi}{6}$, so given that the beam is in the $\sproj{\suparrow}$ state, the probabilities of $+_{\spi/6}$ and $-_{\spi/6}$ are just $p_1(+) = \scos^2\sfrac{\spi}{12}$ and $p_1(-) = \ssin^2\sfrac{\spi}{12}$. So\n$$\srho_1 = \scos^2\sfrac{\spi}{12}\sproj{+_{\spi/6}} + \ssin^2\sfrac{\spi}{12}\sproj{-_{\spi/6}}.$$\nThe next measurement is at $\stheta_2 = \sfrac{\spi}{3}$ -- but the critical thing is that it is at an angle of $\sfrac{\spi}{6}$ ''with respect to the previous measurement''. In other words, //if// the atoms are in the state $\sproj{+_{\spi/6}}$, then the probability of getting $+_{\spi/3}$ is $p_2(+) = \scos^2\sfrac{\spi}{12}$, and the probability of getting $-_{\spi/3}$ is $p_2(-) = \ssin^2\sfrac{\spi}{12}$. On the other hand, if the atoms are in the state $\sproj{-_{\spi/6}}$, then the probability of getting $+_{\spi/3}$ is $p_2(+) = \ssin^2\sfrac{\spi}{12}$, and the probability of getting $-_{\spi/3}$ is $p_2(-) = \scos^2\sfrac{\spi}{12}$. \n\nWe can actually model this using //classical// probability theory! 
Or, to be a bit more careful, once we've applied Born's Rule (a fundamentally quantum device) to find the probabilities for one measurement given the outcome of the previous measurement, the //rest// of the calculation is well-suited to probability theory. The state after the $n$th measurement is a convex combination of orthogonal states ($\sproj{+_{\stheta_n}}$ and $\sproj{-_{\stheta_n}}$), so we can represent it by a 2-element probability vector\n$$\svec{P}_n = \smat{ p_n(+) \s\s p_n(-) },$$\nand then the transition from time $n$ to time $n+1$ is a linear map -- a //stochastic map//, in fact -- on this probability vector. Going back to the discussion in the previous paragraph, we see that\n$$\svec{P}_{n+1} = \smat{ \scos^2\sfrac{\spi}{12} & \ssin^2\sfrac{\spi}{12} \s\s \ssin^2\sfrac{\spi}{12} & \scos^2\sfrac{\spi}{12} }\svec{P}_{n},$$\nwhere (even in homework solutions!) I have to emphasize again that despite the matrices, this is not a fundamentally QM calculation at this point; we're just calculating probabilities.\nNow the answer is immediately at hand, because after the 3rd measurement (which is at $\sfrac{\spi}{2}$)\n\sbegin{eqnarray}\n\svec{P}_3 &=& \smat{ \scos^2\sfrac{\spi}{12} & \ssin^2\sfrac{\spi}{12} \s\s \ssin^2\sfrac{\spi}{12} & \scos^2\sfrac{\spi}{12} }^3\svec{P}_{0} \s\s \n &=& \smat{ \sfrac{2+\ssqrt3}{4} & \sfrac{2-\ssqrt3}{4} \s\s \sfrac{2-\ssqrt3}{4} & \sfrac{2+\ssqrt3}{4}}^3\smat{1\s\s0} = \sfrac{1}{16}\smat{ 8+3\ssqrt3 \s\s 8-3\ssqrt3} \sapprox \smat{ 0.825 \s\s 0.175 }.\n\send{eqnarray}\nThese are the probabilities for a subsequent $J_x$ measurement, and since $J_x$ measurements are repeatable and we have been recombining the beams (rather than "collapsing" into one or the other), they are also the probabilities for the final $J_x$ measurement itself. 
We notice that the probability of getting "+" is even higher than it was in the previous case -- multiple measurements at small angles to one another have enhanced the Zeno effect.\n\n<<<\n* (g) Consider the limit where we chain together $N$ Stern-Gerlach apparatuses, the $n$th of which is oriented at $\stheta = \sfrac{n\spi}{2N}$, and $N\srightarrow\sinfty$. What are the probabilities for the final $\smathbf{J_x}$ measurement?\n<<<\n\nWe developed the theory for this in the previous solution. Each apparatus is oriented at $\stheta = \sfrac{\spi}{2N}$ to the previous one, so the stochastic matrix that maps the probability distribution for one measurement to the next is\n$$\svec{P}_{n+1} = \smat{ \scos^2\sfrac{\spi}{4N} & \ssin^2\sfrac{\spi}{4N} \s\s \ssin^2\sfrac{\spi}{4N} & \scos^2\sfrac{\spi}{4N} }\svec{P}_{n},$$\nand so the final probability distribution is given by\n$$\svec{P}_N = \smat{ \scos^2\sfrac{\spi}{4N} & \ssin^2\sfrac{\spi}{4N} \s\s \ssin^2\sfrac{\spi}{4N} & \scos^2\sfrac{\spi}{4N} }^N\smat{1\s\s0}.$$\nNow, because $\sfrac{\spi}{4N}$ is very small, we can do a Taylor expansion in it:\n$$\svec{P}_N \sapprox \smat{ 1-\sleft(\sfrac{\spi}{4N}\sright)^2 & \sleft(\sfrac{\spi}{4N}\sright)^2 \s\s \sleft(\sfrac{\spi}{4N}\sright)^2 & 1-\sleft(\sfrac{\spi}{4N}\sright)^2 }^N\smat{1\s\s0}.$$\nNow, we can finish the approximation either using physical intuition or mathematical rigor:\n* As physicists, we look at this and observe that there are $N$ steps, and at each step we have a probability $\sfrac{\spi^2}{16N^2}$ of switching from "+" to "-". Therefore, the total probability of switching can be no greater than $\sfrac{\spi^2}{16N}$, which becomes zero as $N\srightarrow\sinfty$. 
The formal statement of this is the [[union bound|http://en.wikipedia.org/wiki/Boole%27s_inequality]].\n* More mathematically, we could observe that for small $\sepsilon$ and $\sepsilon'$,\n$$\smat{ 1-\sepsilon & \sepsilon \s\s \sepsilon & 1-\sepsilon}\smat{ 1-\sepsilon' & \sepsilon' \s\s \sepsilon' & 1-\sepsilon'} = \smat{ 1-(\sepsilon+\sepsilon') & (\sepsilon+\sepsilon') \s\s (\sepsilon+\sepsilon') & 1-(\sepsilon+\sepsilon')} + O(\sepsilon'\sepsilon),$$\nand by applying this repeatedly, we obtain\n$$\smat{ 1-\sepsilon & \sepsilon \s\s \sepsilon & 1-\sepsilon}^N = \smat{ 1-N\sepsilon & N\sepsilon \s\s N\sepsilon & 1-N\sepsilon } + O(N^2\sepsilon^2).$$\nFor this problem, $\sepsilon = \sfr{\spi^2}{16N^2}$, so as $N\sto\sinfty$, $N\sepsilon\sto 0$ (and the correction, of order $N^2\sepsilon^2$, vanishes even faster)... and we come to the same conclusion, which is:\n\nIn the limit as $N\srightarrow\sinfty$, the probability that the final measurement of $J_x$ finds $+\sleft(\sfrac{\shbar}{2}\sright)$ goes to 1. \n\n//Note: This is the full quantum Zeno effect. If we measured $J_x$ //immediately// after $J_z$, we would get a random outcome -- but changing the measurement basis slowly (//adiabatically//) causes the system's state to follow the measurement basis. This does //not// happen classically, and it shows that measurements change the system's state in interesting ways -- i.e., not necessarily in a randomness-inducing way, but possibly in a randomness-decreasing way. This effect is at the heart of a technique for quantum computing called //adiabatic quantum computation//.\n\n
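As a quick numerical sanity check on parts (f) and (g), here is a minimal sketch that iterates the 2x2 stochastic map directly. The function name `final_plus_probability` is my own (hypothetical), and the code assumes, as in the solution above, that the state starts as "+" and that each apparatus is tilted by pi/(2N) relative to the previous one.

```python
import math

def final_plus_probability(n_steps: int) -> float:
    """Probability of '+' after n_steps measurements, each tilted by pi/(2*n_steps)."""
    stay = math.cos(math.pi / (4 * n_steps)) ** 2  # cos^2(pi/4N): outcome repeats
    flip = 1.0 - stay                              # sin^2(pi/4N): outcome switches
    p_plus, p_minus = 1.0, 0.0                     # P_0 = (1, 0)
    for _ in range(n_steps):
        # One application of the symmetric stochastic matrix to (p_plus, p_minus).
        p_plus, p_minus = (stay * p_plus + flip * p_minus,
                           flip * p_plus + stay * p_minus)
    return p_plus

if __name__ == "__main__":
    for n in (1, 3, 10, 100, 1000):
        print(n, final_plus_probability(n))
```

With `n_steps=3` this should reproduce the $(8+3\ssqrt3)/16$ of part (f), and the printed values should creep toward 1 as the number of steps grows, which is the Zeno limit of part (g).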
<<<\n2. ''State and prove the Robertson Uncertainty Relation''\n<<<\n\nThe Robertson Uncertainty relation is: For any observables $A$ and $B$, and any state $\sket\spsi$,\n$$\sleft(\sDelta A\sright)^2\sleft(\sDelta B\sright)^2 \sgeq \sfrac14|\sexpect{[A,B]}|^2.$$\n\n''Proof:'' We start by defining $\sket\salpha = \sleft(A-\sexpect{A}\sright)\sket\spsi~$ and $\sket\sbeta = \sleft(B-\sexpect{B}\sright)\sket\spsi~$. By the [[Schwarz Inequality]], $\sbraket{\salpha}{\salpha}\sbraket{\sbeta}{\sbeta} \sgeq |\sbraket{\salpha}{\sbeta}|^2$. Therefore\n\sbegin{eqnarray}\n\sbraopket{\spsi}{(A^2-2\sexpect{A}A+\sexpect{A}^2)}{\spsi}\sbraopket{\spsi}{(B^2-2\sexpect{B}B+\sexpect{B}^2)}{\spsi} &\sgeq& \sleft|\sbraopket{\spsi}{(AB-\sexpect{A}B-A\sexpect{B}+\sexpect{A}\sexpect{B})}{\spsi}\sright|^2 \s\s\n\sleft(\sexpect{A^2}-\sexpect{A}^2\sright)\sleft(\sexpect{B^2}-\sexpect{B}^2\sright) &\sgeq& \sleft|\sexpect{AB}-\sexpect{A}\sexpect{B}\sright|^2\n\send{eqnarray}\nOn the R.H.S., we can write the operator $AB$ as \n$$AB = \sfrac12(2AB) = \sfrac12\sleft(AB - BA + AB + BA\sright) = \sfrac12\sleft([A,B] + \s{A,B\s}\sright),$$\nwhere $[A,B]$ is antihermitian (so its expectation value is purely imaginary), and $\s{A,B\s}$ is Hermitian (so its expectation value is real). Therefore, the inequality becomes\n$$\sleft(\sDelta A\sright)^2\sleft(\sDelta B\sright)^2 \sgeq \sfrac14\sleft|\sexpect{[A,B]} + \sexpect{\s{A,B\s}}-2\sexpect{A}\sexpect{B}\sright|^2.$$\nFinally, we note that the term on the right is the absolute square of the sum of a purely imaginary term $\sexpect{[A,B]}$ and a purely real term $\sexpect{\s{A,B\s}}-2\sexpect{A}\sexpect{B}$, and that by dropping the real part entirely we cannot //increase// its magnitude, so we do so and get the Robertson uncertainty relation:\n$$\sleft(\sDelta A\sright)^2\sleft(\sDelta B\sright)^2 \sgeq \sfrac14\sleft|\sexpect{[A,B]}\sright|^2.$$\nQED.\n\n
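The Robertson relation is easy to spot-check numerically for a spin-1/2 system. The sketch below is only an illustration under my own assumptions: the helper names (`matmul`, `expect`, `variance`, `robertson_sides`) are hypothetical, operators are 2x2 nested lists, and states are assumed to be normalized.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(op, psi):
    """<psi|op|psi> for a normalized two-component state psi."""
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def variance(op, psi):
    """(Delta op)^2 = <op^2> - <op>^2, real for Hermitian op."""
    return (expect(matmul(op, op), psi) - expect(op, psi) ** 2).real

def robertson_sides(a, b, psi):
    """Return (variance product, Robertson lower bound) for observables a, b."""
    ab, ba = matmul(a, b), matmul(b, a)
    commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    lhs = variance(a, psi) * variance(b, psi)
    rhs = 0.25 * abs(expect(commutator, psi)) ** 2
    return lhs, rhs

if __name__ == "__main__":
    for theta in (0.0, 0.3, 1.0, 2.0):
        psi = [math.cos(theta / 2), math.sin(theta / 2)]
        print(theta, robertson_sides(SIGMA_X, SIGMA_Y, psi))
```

For the spin-up state (`theta = 0`) both sides come out equal to 1, so the bound is saturated there; for the other angles the left side should be at least as large as the right.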
<<<\n3. ''Unitary transformations and the Bloch sphere''\n* (a) Consider a spin-$\ssmall{\sfrac12}$ system, and let $\svec{n}$ be a __unit__ vector in $\sreals^3$. Write down the operator $\smathbf{J_n} = \smathbf{\svec{J}}\scdot\svec{n}$ as a matrix in the $\s{\sket\suparrow,\sket\sdownarrow\s}$ basis, and determine and write down the projectors onto its [normalized] eigenvectors $\sproj{+}$ and $\sproj{-}$ (corresponding, respectively, to the positive and negative eigenvalues).\n<<<\n\nLet $\svec{n} = n_x\shat{x}+n_y\shat{y}+n_z\shat{z}$, where $n_x^2+n_y^2+n_z^2=1$. Then \n$$\smathbf{J_n} = \smathbf{\svec{J}}\scdot\svec{n} = \sfrac{\shbar}{2}\sleft(n_x\ssigma_x + n_y\ssigma_y + n_z\ssigma_z\sright) = \sfrac{\shbar}{2}\smat{n_z & n_x-in_y \s\s n_x + in_y & -n_z}.$$\n\nThere are various ways to find the eigenstates of $\smathbf{J_n}$, but one of the most elegant is to define $\ssigma_n = \sfrac{2}{\shbar}\smathbf{J_n}$ and observe that $\ssigma_n^2 = \sId$,\n$$\ssigma_n^2 = \sleft(n_x^2\ssigma_x^2 + n_y^2\ssigma_y^2 + n_z^2\ssigma_z^2\sright) + \sleft[n_xn_y(\ssigma_x\ssigma_y+\ssigma_y\ssigma_x) + n_xn_z(\ssigma_x\ssigma_z+\ssigma_z\ssigma_x) + n_yn_z(\ssigma_y\ssigma_z+\ssigma_z\ssigma_y)\sright] = (n_x^2+n_y^2+n_z^2)\sId = \sId,$$\nand therefore, since $\sTr\ssigma_n=0$, it has eigenvalues $\spm1$. Therefore, the operators $\sfrac12(\sId\spm\ssigma_n)$ will be the projectors onto $J_n$'s eigenstates (which are the same as those of $\ssigma_n$ because one is a scalar multiple of the other):\n\sbegin{eqnarray}\n\sproj{+} &=& \sfrac12(\sId+\ssigma_n) = \sfrac12\smat{1+n_z & n_x-in_y \s\s n_x + in_y & 1-n_z} \s\s\n\sproj{-} &=& \sfrac12(\sId-\ssigma_n) = \sfrac12\smat{1-n_z & -n_x+in_y \s\s -n_x - in_y & 1+n_z} \s\s\n\send{eqnarray}\n\nAnother useful parametrization is in terms of spherical coordinates, where $n_x = \ssin\stheta\scos\sphi$, $n_y = \ssin\stheta\ssin\sphi$ and $n_z = \scos\stheta$. 
Using this parametrization,\n\sbegin{eqnarray}\n\sproj{+} &=& \sfrac12\smat{1+\scos\stheta & \ssin\stheta(\scos\sphi-i\ssin\sphi) \s\s \ssin\stheta(\scos\sphi+i\ssin\sphi) & 1-\scos\stheta} = \smat{\scos^2(\sfr{\stheta}{2}) & \shalf\ssin\stheta e^{-i\sphi} \s\s \shalf\ssin\stheta e^{i\sphi} & \ssin^2(\sfr{\stheta}{2})} \s\s\n\sproj{-} &=& \sfrac12\smat{1-\scos\stheta & -\ssin\stheta(\scos\sphi-i\ssin\sphi) \s\s -\ssin\stheta(\scos\sphi+i\ssin\sphi) & 1+\scos\stheta} = \smat{\ssin^2(\sfr{\stheta}{2}) & -\shalf\ssin\stheta e^{-i\sphi} \s\s -\shalf\ssin\stheta e^{i\sphi} & \scos^2(\sfr{\stheta}{2})},\n\send{eqnarray}\nand from this parametrization and the half-angle formula $\ssin\sleft(\sfr{\stheta}{2}\sright)\scos\sleft(\sfr{\stheta}{2}\sright) = \shalf\ssin\stheta$, it's clear that \n\sbegin{eqnarray}\n\sket{+} &=& \smat{\scos\sleft(\sfr{\stheta}{2}\sright) \s\s \ssin\sleft(\sfr{\stheta}{2}\sright)e^{i\sphi}} \s\s\n\sket{-} &=& \smat{\ssin\sleft(\sfr{\stheta}{2}\sright) \s\s -\scos\sleft(\sfr{\stheta}{2}\sright)e^{i\sphi}},\n\send{eqnarray}\nwhich is handy, though not required for this problem.\n\n<<<\n* (b) Suppose that the system's time evolution obeys the equation $\sdd{}{t}\sket\spsi = -\sfrac{i}{\shbar}\smathbf{J}_n\sket\spsi$. Write down the unitary operator $\smathbf{U}(t)$ that maps $\sket{\spsi(0)}$ to $\sket{\spsi(t)}$.\n<<<\n\nIt's sufficient for this problem to write $\smathbf{U}(t) = e^{-\sfrac{i}{\shbar}t\smathbf{J}_n}$. \n\nHowever, in part (d) below, we're going to need the matrix form of $\smathbf{U}(t)$, so we will compute it now. Note first that\n$$\sfrac{1}{\shbar}\smathbf{J}_n = \sfrac12\ssigma_n = \sfrac12\sleft(\sproj{+}-\sproj{-}\sright)$$\nin terms of the eigenstate projectors calculated in (a). 
Since the eigenvalues of this operator are $\spm\shalf$, the eigenvalues of $e^{-\sfrac{i}{\shbar}t\smathbf{J}_n}$ will be $e^{\smp it/2}$, and its eigenbasis will be the same as that of $\smathbf{J}_n$, giving\n$$\smathbf{U}(t) = e^{-it/2}\sproj{+} + e^{it/2}\sproj{-} = \smat{ \scos\sleft(\sfr{t}{2}\sright) - in_z \ssin\sleft(\sfr{t}{2}\sright) & -\ssin\sleft(\sfr{t}{2}\sright)(n_y+in_x) \s\s \ssin\sleft(\sfr{t}{2}\sright)(n_y-in_x) & \scos\sleft(\sfr{t}{2}\sright) + in_z \ssin\sleft(\sfr{t}{2}\sright) }.$$\n\n<<<\n* (c) Consider the set $\smathcal{T} = \s{\smathbf{U}(t)\s \sforall\s t\sgeq0~\s}~~$ from part (b). Prove that $\smathcal{T}$ is a [[group|http://en.wikipedia.org/wiki/Group_(mathematics)]]. Is it commutative ([[abelian|http://en.wikipedia.org/wiki/Abelian_group]]) or noncommutative (nonabelian)?\n<<<\n\nIf we write $\smathbf{U}(t)$ in the $\sket{+},\sket{-}$ basis, it is\n$$\smathbf{U}(t) = \smat{e^{-it/2} & 0 \s\s 0 & e^{it/2}}.$$\nTo show that $\smathcal{T} = \s{\smathbf{U}(t)\s \sforall\s t\sgeq0~\s}~$ is a group, we must show that:\n# $\smathcal{T}$ is closed under multiplication: __Proof__: clearly $\smathbf{U}(t_1)\smathbf{U}(t_2) = \smathbf{U(t_1+t_2)} \sin \smathcal{T}$,\n# the binary operation is associative: __Proof__: matrix multiplication is associative,\n# there is an identity element: __Proof__: $\smathbf{U}(0) = \sId$, and $\sId\smathbf{U}(t) = \smathbf{U}(t)$ for all $t$, and\n# for each $\smathbf{U}(t)$, there exists $\smathbf{U}(t)^{-1}\sin\smathcal{T}$ such that $\smathbf{U}(t)^{-1}\smathbf{U}(t)=\sId$: __Proof__: First, note that $\smathbf{U}(t+4\spi) = \smathbf{U}(t)$, so $\smathbf{U}(t) = \smathbf{U}(t\smathrm{\s mod\s }4\spi)$. 
Now, since $\smathbf{U}(4\spi)=\sId$, we can choose $\smathbf{U}(t)^{-1} = \smathbf{U}\sleft(4\spi-(t\smathrm{\s mod\s }4\spi)\sright)$, and this is always an inverse for $\smathbf{U}(t)$.\n\nSince all the $\smathbf{U}(t)$ are diagonal in the $\sket{+},\sket{-}$ basis, they all commute with each other, and so this group is abelian.\n\n<<<\n* (d) Suppose the system's initial state is $\sket{\spsi(0)} = \sket\suparrow$. Write down $\sexpect{\ssigma_x}$, $\sexpect{\ssigma_y}$, and $\sexpect{\ssigma_z}$ as functions of time. Derive a necessary and sufficient condition on $\svec{n}$ for there to exist a time $t$ when $\sbraket{\spsi(0)}{\spsi(t)}=0$.\n<<<\n\nSince $\sket{\spsi(t)} = \smathbf{U}(t)\sket{\spsi(0)}$, for an operator $\ssigma_j$,\n$$\sexpect{\ssigma_j}(t) = \sbraopket{\spsi(0)}{\smathbf{U}(t)^\sdagger\ssigma_j\smathbf{U}(t)}{\spsi(0)}.$$\nIn the standard basis we have been using, \n$$\sket{\spsi(t)} = \smat{ \scos\sleft(\sfr{t}{2}\sright) - in_z \ssin\sleft(\sfr{t}{2}\sright) & -\ssin\sleft(\sfr{t}{2}\sright)(n_y+in_x) \s\s \ssin\sleft(\sfr{t}{2}\sright)(n_y-in_x) & \scos\sleft(\sfr{t}{2}\sright) + in_z \ssin\sleft(\sfr{t}{2}\sright) } \smat{ 1 \s\s 0 } = \smat{ \scos\sleft(\sfr{t}{2}\sright) - in_z \ssin\sleft(\sfr{t}{2}\sright) \s\s \ssin\sleft(\sfr{t}{2}\sright)(n_y-in_x)}.$$\nFrom this we can work out\n\sbegin{eqnarray}\n\sexpect{\ssigma_x} &=& \sbraopket{\spsi(t)}{\ssigma_x}{\spsi(t)} = 2\ssin\sleft(\sfr{t}{2}\sright)\sleft[n_y \scos\sleft(\sfr{t}{2}\sright) + n_x n_z \ssin\sleft(\sfr{t}{2}\sright)\sright] \s\s\n&=& n_y\ssin(t) + n_xn_z\sleft( 1-\scos(t) \sright) \s\s\n\sexpect{\ssigma_y} &=& \sbraopket{\spsi(t)}{\ssigma_y}{\spsi(t)} = 2\ssin\sleft(\sfr{t}{2}\sright)\sleft[-n_x \scos\sleft(\sfr{t}{2}\sright) + n_y n_z \ssin\sleft(\sfr{t}{2}\sright)\sright] \s\s\n&=& -n_x\ssin(t) + n_yn_z\sleft( 1-\scos(t) \sright) \s\s\n\sexpect{\ssigma_z} &=& \sbraopket{\spsi(t)}{\ssigma_z}{\spsi(t)} = \scos^2\sleft(\sfr{t}{2}\sright) + \sleft( n_z^2 - n_x^2 - n_y^2 
\sright) \ssin^2\sleft(\sfr{t}{2}\sright) \s\s\n&=& \scos(t) + n_z^2\sleft( 1-\scos(t) \sright) = 1-2(1-n_z^2)\ssin^2\sleft(\sfr{t}{2}\sright)\n\send{eqnarray}\n\nNow, under what conditions will $\sbraket{\spsi(0)}{\spsi(t)} = 0$ at some time? This means that the system has evolved from $\sket\suparrow$ into an orthogonal state, which can only be $\sket\sdownarrow$. The geometric picture of the dynamics is that the Bloch sphere is rotating around the $\svec{n}$ axis, and in order for this rotation to (at any time) rotate the "North pole" of the Bloch sphere around to its bottom, $\svec{n}$ must lie in the equatorial plane -- i.e., $n_z = 0$. If $\svec{n}$ does lie in this plane, then a half-period rotation will rotate $\sket\suparrow$ around to $\sket\sdownarrow$. Thus the necessary and sufficient condition is $n_z=0$. (//Note: this physical argument is sufficient for full credit.//)\n\nWe can show this mathematically as well:\n$$0 = |\sbraket{\spsi(0)}{\spsi(t)}|^2 = |\scos\sleft(\sfr{t}{2}\sright) - in_z \ssin\sleft(\sfr{t}{2}\sright)|^2 = \scos^2\sleft(\sfr{t}{2}\sright) + n_z^2\ssin^2\sleft(\sfr{t}{2}\sright) = n_z^2 + (1-n_z^2)\scos^2\sleft(\sfr{t}{2}\sright)$$\nThis can be zero only if $n_z=0$, and if $n_z=0$, then $\sbraket{\spsi(0)}{\spsi(t)}=0$ when $t=\spi$. Thus $n_z=0$ is the necessary and sufficient condition. (//Note: this mathematical argument is sufficient for full credit.//)\n\n<<<\n* (e) Continuing the physical scenario of (d), calculate the exact value of $(\sDelta \smathbf{J}_n)^2(\sDelta \smathbf{J}_z)^2$ as a function of time. Compare it with the bound you get from the Robertson relation (also a function of time).\n<<<\n\nConveniently, $\ssigma_n^2 = \ssigma_z^2 = \sId$. 
Thus,\n\sbegin{eqnarray}\n(\sDelta \smathbf{J}_n)^2 &=& \sexpect{\smathbf{J}_n^2} - \sexpect{\smathbf{J}_n}^2 = \sleft(\sfrac{\shbar}{2}\sright)^2(1-\sexpect{\ssigma_n}^2) \s\s\n(\sDelta \smathbf{J}_z)^2 &=& \sexpect{\smathbf{J}_z^2} - \sexpect{\smathbf{J}_z}^2 = \sleft(\sfrac{\shbar}{2}\sright)^2(1-\sexpect{\ssigma_z}^2)\n\send{eqnarray}\nWe have already calculated $\sexpect{\ssigma_z}(t) = \scos(t) + n_z^2\sleft( 1-\scos(t) \sright)$. For $\sexpect{\ssigma_n}(t)$, something rather nice happens: because $\smathbf{U}(t) = e^{-\sfr{it}{2}\ssigma_n}$ commutes with $\ssigma_n$,\n$$\sexpect{\ssigma_n}(t) = \sbraopket{\spsi(t)}{\ssigma_n}{\spsi(t)} = \sbraopket{\spsi(0)}{\smathbf{U}(t)^\sdagger\ssigma_n\smathbf{U}(t)}{\spsi(0)} = \sbraopket{\spsi(0)}{\ssigma_n}{\spsi(0)}$$\nwhich means that \n$$\sexpect{\ssigma_n}(t) = \sexpect{\ssigma_n}(0) = n_x\sexpect{\ssigma_x}(0) + n_y\sexpect{\ssigma_y}(0) + n_z\sexpect{\ssigma_z}(0) = n_z.$$\nSo\n\sbegin{eqnarray}\n(\sDelta \smathbf{J}_n)^2(\sDelta \smathbf{J}_z)^2 &=& \sleft(\sfrac{\shbar}{2}\sright)^4(1-n_z^2)\sleft(1-\sleft(1-2(1-n_z^2)\ssin^2\sleft(\sfr{t}{2}\sright)\sright)^2\sright) \s\s\n&=& 4\sleft(\sfrac{\shbar}{2}\sright)^4(1-n_z^2)^2\ssin^2\sleft(\sfr{t}{2}\sright) \sleft( \scos^2\sleft(\sfr{t}{2}\sright) + n_z^2\ssin^2\sleft(\sfr{t}{2}\sright) \sright) \s\s\n&=& \sleft(\sfrac{\shbar}{2}\sright)^4(1-n_z^2)^2 \sleft( \ssin^2(t) + n_z^2(1-\scos(t))^2 \sright)\n\send{eqnarray}\nThe Robertson relation, on the other hand, gives\n$$\sleft(\sDelta \smathbf{J}_n\sright)^2\sleft(\sDelta \smathbf{J}_z\sright)^2 \sgeq \sfrac14|\sexpect{[\smathbf{J}_n,\smathbf{J}_z]}|^2.$$\nSince \n$$[\smathbf{J}_n,\smathbf{J}_z] = n_x[\smathbf{J}_x,\smathbf{J}_z] + n_y[\smathbf{J}_y,\smathbf{J}_z]+n_z[\smathbf{J}_z,\smathbf{J}_z] = 2i\sleft(\sfr{\shbar}{2}\sright)^2\sleft(n_y\ssigma_x - n_x\ssigma_y\sright),$$\nwe get \n\sbegin{eqnarray}\n\sleft(\sDelta \smathbf{J}_n\sright)^2\sleft(\sDelta \smathbf{J}_z\sright)^2 &\sgeq& 
\sleft(\sfr{\shbar}{2}\sright)^4|\sexpect{n_y\ssigma_x - n_x\ssigma_y}|^2 \s\s\n&=& \sleft(\sfr{\shbar}{2}\sright)^4\sleft| n_y\sleft(n_y\ssin(t) + n_xn_z\sleft( 1-\scos(t) \sright)\sright) - n_x\sleft(-n_x\ssin(t) + n_yn_z\sleft( 1-\scos(t)\sright)\sright) \sright|^2 \s\s\n&=& \sleft(\sfr{\shbar}{2}\sright)^4\sleft| \ssin(t)(n_x^2+n_y^2) \sright|^2 \s\s\n&=& \sleft(\sfr{\shbar}{2}\sright)^4\ssin^2(t)(1-n_z^2)^2\n\send{eqnarray}\nComparing the two expressions, we see that (1) the Robertson bound is indeed a lower bound (and simpler to calculate!), but (2) the true uncertainty is generally greater by the $n_z^2(1-\scos(t))^2$ term.
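For anyone who wants to double-check the algebra in 3(d) and 3(e), here is a small sketch that builds the state at time t from the first column of U(t) and evaluates both the exact uncertainty product and the Robertson bound. Everything is in units of (hbar/2)^4, the function name `uncertainty_product_and_bound` is hypothetical, and the axis n is assumed to be a unit vector.

```python
import math

def uncertainty_product_and_bound(n, t):
    """Return (exact product, Robertson bound) in units of (hbar/2)^4.

    n is a unit vector (nx, ny, nz); the initial state is spin-up.
    """
    nx, ny, nz = n
    c, s = math.cos(t / 2), math.sin(t / 2)
    # |psi(t)> = first column of U(t) = cos(t/2) Id - i sin(t/2) sigma_n
    a = complex(c, -nz * s)
    b = s * complex(ny, -nx)
    overlap = a.conjugate() * b
    sx = 2 * overlap.real            # <sigma_x>
    sy = 2 * overlap.imag            # <sigma_y>
    sz = abs(a) ** 2 - abs(b) ** 2   # <sigma_z>
    sn = nx * sx + ny * sy + nz * sz  # <sigma_n>, should stay equal to nz
    exact = (1 - sn ** 2) * (1 - sz ** 2)
    bound = (ny * sx - nx * sy) ** 2
    return exact, bound

if __name__ == "__main__":
    axis = (0.6, 0.0, 0.8)
    for t in (0.5, 1.0, 2.0, 3.0):
        print(t, uncertainty_product_and_bound(axis, t))
```

The two numbers can be compared against the closed forms derived above; they coincide when the axis lies in the equatorial plane (n_z = 0), and otherwise the exact product is the larger of the two.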
<<<\n4. ''The Pauli and Clifford Groups''\n* (a) Consider the operators $\s{\ssigma_x,\ssigma_z\s}\s,$ on a spin-$\sfrac12$ quantum system. Write down the group $P$ that they [[generate|http://en.wikipedia.org/wiki/Group_generator]]. What is the [[order|http://en.wikipedia.org/wiki/Order_of_a_group]] of $P$? How many elements are in its [[center|http://en.wikipedia.org/wiki/Center_of_a_group]]?\n<<<\n\nWe begin by noting that each of the two generators given is an //involution//, i.e., it has order 2: its square is $\sId$. So we get nothing interesting by just taking powers of $\ssigma_x$ or $\ssigma_z$. If we multiply them, then we get two new elements: $\ssigma_x\ssigma_z = -i\ssigma_y$ and $\ssigma_z\ssigma_x = i\ssigma_y$. So far we have $\s{\ssigma_x,\ssigma_z,\sId,i\ssigma_y,-i\ssigma_y\s}$. Taking products of these elements, we get new elements $\ssigma_x\ssigma_z\ssigma_x = -\ssigma_z$ and $\ssigma_z\ssigma_x\ssigma_z = -\ssigma_x$. Obviously, $-\ssigma_z\ssigma_z = -\sId$, so now we have \n$$P = \s{\spm\ssigma_x,\spm\ssigma_z,\spm\sId,\spm i\ssigma_y\s}.$$\nWe can check that this set is closed under multiplication, so it is the group generated by $\s{\ssigma_x,\ssigma_z\s}$. It has 8 elements, so its order is 8. The center consists of all the elements that commute with //every// element of the group, and since none of the Pauli $\ssigma_j$ operators commute with each other, the center consists of the 2-element group $\s{\spm\sId\s}$.\n\n<<<\n* (b) How does the Bloch sphere transform under a central element of $P$ (i.e., an element of the center)? How about a noncentral element? \n<<<\n\nThe central elements of $P$ are multiples of the identity operator. Both of them leave all states unchanged (an overall factor of $-1$ does not change the state), so the Bloch sphere is invariant under the center of $P$. 
Each of the other elements is a unitary transformation (and therefore a rotation of the Bloch sphere) whose square is proportional to the identity operator. Therefore, it must be a rotation by $\spi$ around some axis. The eigenstates of the $\ssigma_x,\ssigma_y,\ssigma_z$ operators point (respectively) along the $\shat{x},\shat{y},\shat{z}$ axes. Since an eigenstate is unchanged by the corresponding rotation, the non-central elements of the Pauli group must correspond to rotations by $\spi$ around one of the three axes $\shat{x},\shat{y},\shat{z}$.\n\n<<<\n* (c) Now, consider the operators $\sleft\s{e^{i\sfrac{\spi}{4}\ssigma_x},e^{i\sfrac{\spi}{4}\ssigma_z}\sright\s}\s,$. What is the order of the group $C$ that they generate? How many elements are in its center?\n<<<\n\nA geometrical perspective on this problem is very helpful. Each generator is a rotation of the Bloch sphere around either the $\shat{x}$ or $\shat{z}$ axis (why? because for any operator $A$, $e^A$ commutes with $A$, so the eigenstates of the operator in the exponent are eigenstates of the unitary). If we raise either generator to the 4th power, we get $e^{i\spi\ssigma_j} = -\sId$. This tells us two things:\n# The group contains both $\sId$ (as the 8th power of either generator) and $-\sId$, so for every operator $U$ in the group, $-U$ will also be in it.\n# Each generator must be a rotation by $\sfrac{\spi}{2}$ around its respective axis, since repeating it 4 times gives $-\sId$, which is the identity rotation (remember, global phase is irrelevant).\n\n[>img(35%,auto)[images/HW3/Xrot.png]]\n\nWe can also deduce another useful property: each of the generators has determinant 1 (because it has two eigenvalues $e^{\spm i \spi/4}$), and since $\sdet(AB) = \sdet(A)\sdet(B)$, every element in the group will also have determinant 1. 
From this, it immediately follows that there are no multiples of $\sId$ in the group other than $\sId$ and $-\sId$, which in turn means that for any operator $A$ in the group, the only multiples of $A$ in the group are $\spm A$ (this follows because $A$ must have an inverse $A^{-1}$ in the group, and if the group contains $zA$, then it contains $A^{-1}zA = z\sId$, which means $z=\spm1$).\n\nAt this point, we can completely figure out the structure of the group by considering how its elements transform the Bloch sphere. The generators are $\sfrac{\spi}{2}$ rotations around the $\shat{x}$ and $\shat{z}$ axes, which means that they permute the six states $\s{\sket\suparrow,\sket\sdownarrow,\sket\srightarrow,\sket\sleftarrow,\sket\sinarrow,\sket\soutarrow\s}$. For instance, $e^{i\sfrac{\spi}{4}\ssigma_z}$ leaves the $\sket\suparrow$ and $\sket\sdownarrow$ states fixed, but cyclically permutes $\sket\srightarrow \srightarrow \sket\sinarrow \srightarrow \sket\sleftarrow \srightarrow \sket\soutarrow \srightarrow \sket\srightarrow$. Similarly, $e^{i\sfrac{\spi}{4}\ssigma_x}$ leaves the $\sket\srightarrow$ and $\sket\sleftarrow$ states fixed, but cyclically permutes $\sket\suparrow \srightarrow \sket\sinarrow \srightarrow \sket\sdownarrow \srightarrow \sket\soutarrow \srightarrow \sket\suparrow$.\n\n[>img(35%,auto)[images/HW3/Zrot.png]]\n\nBy composing these rotations, we can rotate the $\sket\suparrow$ state to any of the six points that we choose (including leaving it right where it sits). For each such choice, there are four distinct rotations in the group -- i.e., once we've rotated the $\sket\suparrow$ state to any desired point, we can put the $\sket\srightarrow$ state in any of 4 possible positions. These two choices completely define the rotation. This yields 24 different rotations (the [[rotational symmetry group of the octahedron|http://en.wikipedia.org/wiki/Octahedral_symmetry]]!), all of which can be obtained by applying the generators in sequence. 
We're not done yet, because (as we observed before) for every rotation matrix $U$, $-U$ is also part of the group -- but only these two multiples! Therefore, our group $C$ contains 48 elements (2 copies of each octahedral rotation), and its order is 48. The only operators that commute with all these rotations are the multiples of $\sId$; the group contains $\spm\sId$, so the center contains 2 elements.\n\n<<<\n* (d) Is $P$ a subgroup of $C$?\n<<<\n\nNo. A subgroup of $C$ is simply a subset of $C$ that happens to be a group; we already determined that $P$ is a group, but we also determined that $C$ contains //only// matrices with determinant 1. $P$ contains $\ssigma_x$ and $\ssigma_z$, whose determinants are $-1$, and therefore it is not contained in $C$. (Note that $i\ssigma_y$ has determinant $+1$, so it does not settle the question.)\n\nHowever, if you forgot about the subtle phase issue and just argued (correctly) that $P$ consists of $\spi$ rotations around the $\shat{x},\shat{y},\shat{z}$ axes, and that each of these transformations is also in $C$, then you get half credit anyway for good physical intuition.\n
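The orders and centers quoted in problem 4 can also be found by brute force. The following is a rough sketch, not the method used in the solutions: it closes the generating set under multiplication, identifying matrices by their entries rounded to 6 decimals (my own assumption for handling floating-point noise). All function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def key(m):
    """Hashable fingerprint of a matrix, tolerant of rounding noise."""
    return tuple((round(z.real, 6), round(z.imag, 6)) for row in m for z in row)

def generate_group(generators):
    """Close the generators under multiplication (assumes the group is finite)."""
    identity = ((1 + 0j, 0j), (0j, 1 + 0j))
    elements = {key(identity): identity}
    frontier = [identity]
    while frontier:
        fresh = []
        for g in frontier:
            for h in generators:
                p = matmul(g, h)
                if key(p) not in elements:
                    elements[key(p)] = p
                    fresh.append(p)
        frontier = fresh
    return list(elements.values())

def center(elements):
    """Elements that commute with every element of the group."""
    return [g for g in elements
            if all(key(matmul(g, h)) == key(matmul(h, g)) for h in elements)]

SIGMA_X = ((0j, 1 + 0j), (1 + 0j, 0j))
SIGMA_Z = ((1 + 0j, 0j), (0j, -1 + 0j))
R = 1 / math.sqrt(2)
EXP_X = ((R + 0j, R * 1j), (R * 1j, R + 0j))  # exp(i pi/4 sigma_x)
EXP_Z = ((cmath.exp(1j * math.pi / 4), 0j), (0j, cmath.exp(-1j * math.pi / 4)))

P = generate_group([SIGMA_X, SIGMA_Z])
C = generate_group([EXP_X, EXP_Z])

if __name__ == "__main__":
    print(len(P), len(center(P)), len(C), len(center(C)))
    print(key(SIGMA_X) in {key(g) for g in C})
```

The first printed line reports the order and center size of each group; the last line tests whether sigma_x itself (as a matrix, phase included) appears among the elements of C, which is the question asked in part (d).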
<<<\n5. ''Distributions''\nConsider the [[Dirac delta distribution|http://en.wikipedia.org/wiki/Dirac_delta]] $\sdelta(x)$, defined as a [[linear functional|http://en.wikipedia.org/wiki/Linear_functional]] from complex functions $f(x)$ to complex scalars by \n$$\sbraket{\sdelta}{f} = \sint_{-\sinfty}^{\sinfty}{\sdelta(x)f(x)\sdiff x} \sequiv f(0).$$\n[[Distributions|http://en.wikipedia.org/wiki/Distribution_(mathematics)]] are dual vectors to [[well-behaved|http://en.wikipedia.org/wiki/Well-behaved]] [[test functions|http://en.wikipedia.org/wiki/Schwartz_function]], meaning that this inner product is only guaranteed to be defined when $f(x)$ satisfies certain properties.\n* (a) Provide an explicit example of a function $f(x)$ for which $\sbraket{\sdelta}{f(x)}$ is not defined.\n<<<\n\nBasically, any function where $f(0)$ is undefined or infinite. A particularly nice example is $f(x) = \ssin(1/x)$, which is bounded, and is also smooth and defined everywhere except at $x=0$, but has no limit as $x\srightarrow0$.\n\n<<<\n* (b) Well-behaved test functions are infinitely differentiable, and decay smoothly and rapidly to zero at $x=\spm\sinfty$. Use this fact, and integration by parts, to define $\sdelta'(x) = \spd{~}{x}\sdelta(x)$ as a linear functional on test functions.\n<<<\n\nThe delta distribution is defined by $\sbraket{\sdelta}{f} = \sint{\sdelta(x)f(x)\sdiff x} = f(0)~$. To define $\sdelta'(x) = \spd{~}{x}\sdelta(x)$, we insert it into an integral in the same way\n$$\sbraket{\sdelta'}{f} = \sint{\spd{~}{x}\sdelta(x)f(x)\sdiff x},$$\nand then use integration by parts\n$$\sint{\spd{~}{x}\sdelta(x)f(x)\sdiff x} = \sleft[\sdelta(x)f(x)\sright]\s!\s!\s!{\sbegin{array}{c} {\sscriptstyle \sinfty} \s\s {\sscriptstyle -\sinfty}\send{array}} \s - \sint{\sdelta(x)\spd{~}{x}f(x)\sdiff x} = -\sint{\sdelta(x)\spd{~}{x}f(x)\sdiff x} =-f'(0),$$\nwhere the bracketed term vanishes because both $\sdelta(x)$ and $f(x)$ vanish as $|x|\srightarrow\sinfty$. 
Thus, our final result is that the derivative of the delta distribution is defined as\n$$\sbraket{\sdelta'}{f} = -f'(0).$$\n\n<<<\n* (c) Provide an explicit example of a function $f(x)$ for which $\sbraket{\sdelta}{f(x)}$ is defined, but $\sbraket{\sdelta'(x)}{f(x)}$ is //not// defined.\n<<<\n\nAny function which is well-defined at $x=0$, but not differentiable there. A nice example is $f(x) = e^{-|x|}$.\n\nThis demonstrates why test functions need to be infinitely differentiable -- we can define the $n$th derivative of $\sdelta(x-x_0)$ and use it to evaluate the $n$th derivative of the test function at any point $x_0$.\n\n<<<\n* (d) The Dirac distribution is not a function. Define an infinite sequence of functions $\s{g_1(x), g_2(x),\sldots\s}$ that converges to $\sdelta(x)$ in the sense that $\sbraket{g_k(x)}{f(x)}$ converges to $\sbraket{\sdelta(x)}{f(x)}$ for any well-behaved test function $f(x)$ as $k\srightarrow\sinfty$. Prove that although this sequence converges in that sense (i.e., in the distribution topology), it is not [[Cauchy|http://en.wikipedia.org/wiki/Cauchy_sequence]] in the metric $d(f,g)=1-\sfrac{|\sbraket{f}{g}|^2}{\sbraket{f}{f}\sbraket{g}{g}}$.\n<<<\n\nAgain, there are multiple solutions to this. The key thing is that each $g_k(x)$ should integrate to $1$, and should be almost entirely confined to a vanishingly small interval around $x=0$ as $k\srightarrow\sinfty$. Two canonical solutions are\n$$g_k(x) = \sleft\s{\sbegin{array}{ll} k/2 & \smathrm{\s if\s }|x|<k^{-1} \s\s 0 & \smathrm{\s otherwise}\send{array}\sright.$$\nand\n$$g_k(x) = \sfrac{k}{\ssqrt{\spi}}e^{-k^2x^2}.$$\nIn either case, we can show that $\sint{g_k(x)f(x)\sdiff x}$ converges to $f(0)$ by writing $f(x) = f(0) + xf'(\sxi)$ (Taylor's theorem with remainder) around $x=0$, which is possible because $f(x)$ is smooth with bounded derivative; the remainder term contributes at most of order $1/k$.\n\nA sequence $\s{g_k\s}$ is Cauchy if, for every $\sepsilon>0$, there exists an $N$ such that for all $n,m>N$, $d(g_n,g_m)<\sepsilon$. 
Intuitively, this means that all the terms beyond $N$ lie inside a ball of diameter $\sepsilon$, which is clearly a strong notion of convergence. In order to show that either of the previous sequences of functions is //not// Cauchy, it is sufficient to show that the sequence has a subsequence that is not Cauchy (for every subsequence of a Cauchy sequence is also Cauchy).\n\nWe choose the subsequence where $k=2^n$ for $n=1,2,3,\sldots$, and consider the inner product between consecutive terms $\sbraket{g_{2^n}}{g_{2^{n+1}}}$. For the first sequence,\n$$\sbraket{g_{2^n}}{g_{2^{n+1}}} = \sleft(\sfrac{2^n}{2}\sright)\sleft(\sfrac{2^{n+1}}{2}\sright)(2)(2^{-(n+1)}) = 2^{n-1}$$\nand the squared norms of $g_{2^n}$ and $g_{2^{n+1}}$ are\n$$\sbraket{g_{2^n}}{g_{2^n}} = 2^{n-1}$$\nand\n$$\sbraket{g_{2^{n+1}}}{g_{2^{n+1}}} = 2^{n}$$\nso \n$$d(g_{2^n},g_{2^{n+1}}) = 1-\sfrac{2^{2n-2}}{2^n2^{n-1}} = \sfrac12.$$\nEach term in the subsequence is a distance $1/2$ from the previous term, so this sequence of functions is not Cauchy -- i.e., as a sequence of square-integrable functions, it does not converge at all.\nFor the second sequence,\n$$\sbraket{g_{2^n}}{g_{2^{n+1}}} = \sfrac{2^n}{\ssqrt{\spi}}\sfrac{2^{n+1}}{\ssqrt{\spi}} \sint{e^{-(2^{2n}+2^{2n+2})x^2}\sdiff x} = \sfrac{2^{n+1}}{\ssqrt{5\spi}}$$\nand the squared norms are\n$$\sbraket{g_{2^n}}{g_{2^n}} = \sfrac{2^n~}{\ssqrt{2\spi}}$$\nand\n$$\sbraket{g_{2^{n+1}}}{g_{2^{n+1}}} = \sfrac{2^{n+1}}{\ssqrt{2\spi}}$$\nso\n$$d(g_{2^n},g_{2^{n+1}}) = 1-4/5 = \sfrac15.$$\nOnce again, the sequence is not Cauchy. In fact, this will always be true -- the $\sdelta$ distribution is not square-integrable, but the Riesz-Fischer theorem proves that the set of square-integrable functions $L^2(\sreals)$ is //complete// -- i.e., it contains the limit points of all its Cauchy sequences. The $\sdelta$ distribution is not the limit of any Cauchy sequence.\n\n
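The two distances found in 5(d) can be checked with a crude numerical integration. This is only a sketch under my own assumptions: a midpoint rule on a fixed interval, a Gaussian normalized to unit integral (the overall normalization cancels in the metric anyway), and hypothetical helper names `inner`, `distance`, `box`, and `gaussian`.

```python
import math

def inner(f, g, half_width=4.0, cells=2 ** 15):
    """Midpoint-rule estimate of the integral of f(x) g(x) over [-half_width, half_width]."""
    h = 2 * half_width / cells
    total = 0.0
    for i in range(cells):
        x = -half_width + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h

def distance(f, g):
    """The metric from the problem: 1 - |<f|g>|^2 / (<f|f><g|g>), for real f and g."""
    return 1.0 - inner(f, g) ** 2 / (inner(f, f) * inner(g, g))

def box(k):
    """Height k/2 on |x| < 1/k, zero elsewhere."""
    return lambda x: k / 2 if abs(x) < 1 / k else 0.0

def gaussian(k):
    """Gaussian of width ~1/k, normalized to unit integral."""
    return lambda x: k / math.sqrt(math.pi) * math.exp(-(k * x) ** 2)

if __name__ == "__main__":
    for n in (1, 2, 3):
        k = 2 ** n
        print(n, distance(box(k), box(2 * k)), distance(gaussian(k), gaussian(2 * k)))
```

The grid spacing is a power of two, so the edges of the box functions fall exactly on cell boundaries for the small values of n used here. The printed distances should not shrink as n grows, which is the failure of the Cauchy property in numerical form.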
<<<\n1. ''Symmetric and antisymmetric subspaces, and the permutation group'' (30 points)\nA //permutation// on $N$ objects (the objects are conventionally labeled $\s{1,2,3\sldots N\s}$), is a map $\spi:\s{1,2,3\sldots N\s}\sto\s{1,2,3\sldots N\s}$ satisfying $\spi(j)=\spi(k)$ if and //only// if $j=k$. In quantum theory, for each permutation $\spi$ on $N$ objects, there is a corresponding unitary operator $\shat{P}_\spi$ that acts on an $N$-particle Hilbert space.\n**(a) Prove the following three properties of the permutations on $N$ objects (for a fixed integer $N>0$): (5 points)\n***(i) There are $N!$ distinct permutations of $N$ objects.\n<<<\n\nRecall that $\spi$ is a map from $\s{1\sldots N\s}$ to $\s{1\sldots N\s}$, and $\spi(j)=\spi(k)\s \sLeftrightarrow j=k$. Therefore, there are $N$ possible values for $\spi(1)$, but only $N-1$ for $\spi(2)$ (since $\spi(1)\sneq\spi(2)$). Similarly, there are $N-2$ choices for $\spi(3)$, and in general there are $N+1-k$ choices for $\spi(k)$ if we have already chosen $\spi(1)\sldots\spi(k-1)$. Thus, the total number of permutations is $N(N-1)(N-2)\sldots(2)(1) \sequiv N!$.\n\n<<<\n***(ii) The set of all permutations of $N$ objects forms a group (under the standard composition operation whereby $\spi_1\spi_2$ means "first apply $\spi_2$, then $\spi_1$").\n<<<\n\nDenote by $S_N$ the set of all permutations of $N$ objects The three properties of a group are:\n# //It must be closed under the binary composition operation:// Define $\spi = \spi_1\scirc\spi_2$, so $\spi(j) = \spi_1(\spi_2(j))$. Since $\spi_1$ and $\spi_2$ are functions from $\s{1\sldots N\s}\srightarrow\s{1\sldots N\s}$, $\spi$ is as well. Furthermore, $\spi(j) = \spi(k)$ if and only if $\spi_2(j) = \spi_2(k)$, which is true if and only if $j=k$. Therefore, $\spi$ is a permutation.\n# //There must be an identity element:// Define the permutation $\spi_0(j) = j$. 
Then for any permutation $\spi$, $(\spi_0\scirc\spi)(j) = \spi_0(\spi(j)) = \spi(j)$, so $\spi_0$ is the identity element of $S_N$.\n# //Each element must have an inverse:// Let $\spi$ be an arbitrary permutation. Because $\spi(j)=\spi(k)$ iff $j=k$, for every $j$ there exists a __unique__ $\spi(j)$. Therefore, we can define the permutation $\spi^{-1}$ such that for each $j$, $\spi^{-1}(\spi(j)) = j$. Therefore, $\spi^{-1}\scirc\spi = \spi_0$, and $\spi^{-1}$ is the inverse element for $\spi$.\n\n<<<\n***(iii) Every permutation on $N$ objects can be written as a composition of at most $N$ //swaps//, where the swap $\spi_{i,j}$ maps $i\sleftrightarrow j$ and leaves all other objects alone.\n<<<\n\nThere are a few different ways to show this, but here is one of the most elegant. I'm going to show that the //inverse// of any permutation can be built out of (at most) $N$ swaps. That is, given any $\spi$, there exist at most $N$ swaps $\spi_{i_n,j_n}$ such that\n$$\spi_{i_N,j_N}\scirc\spi_{i_{N-1},j_{N-1}}\scirc\sldots\scirc\spi_{i_1,j_1}\spi = \spi_0,$$\nwhere $\spi_0$ is the identity permutation. Since every $\spi$ is the inverse of $\spi^{-1}$, by applying this reasoning to $\spi^{-1}$, we prove the theorem.\n\nThe proof is a simple application of finite induction. Let $\spi$ be a permutation on $N$ elements. If $N=1$, then $\spi=\spi_0$ and the problem is trivial (i.e., $\spi$ can be built out of zero swaps). If $N>1$, then either:\n### $\spi(1)=1$, in which case $\spi$ already maps element 1 to itself, and so is merely a permutation of the $N-1$ elements $\s{2\sldots N\s}$. Define $\spi'=\spi$. //or//\n### $\spi(1) = k$ for some $k\sneq1$. In this case, we define $\spi' = \spi_{1,k}\scirc\spi$. Now, $\spi'(1) = \spi_{1,k}(\spi(1)) = \spi_{1,k}(k) = 1$, so $\spi'$ is merely a permutation of the $N-1$ elements $\s{2\sldots N\s}$.\nIn either case, $\spi'$ is a permutation on $N-1$ elements (since $\spi'(1)=1$). 
Applying this procedure $N-1$ times, we end up with a permutation on 1 element, which must be the identity permutation:\n$$\sleft(\spi_{1,\spi(1)}\scirc\spi_{2,\spi'(2)}\scirc\spi_{3,\spi''(3)}\sldots\sright)\spi = \spi_0,$$\nand therefore the product of $N-1$ swaps must be equal to $\spi^{-1}$. So, in fact, I've shown that a permutation on $N$ objects can be written as a composition of at most $N-1$ swaps, which just shows that I was a little careless when I wrote the problem!\n\n<<<\n**(b) List all 6 permutation operators for $N=3$, and define the action of each one by specifying how it acts on an arbitrary element of the tensor product basis $\sket{i}\sotimes\sket{j}\sotimes\sket{k} \sequiv \sket{ijk}$. (5 points)\n<<<\n\nThere are various ways to denote the permutations, and it doesn't matter for this problem //how// you denote them, as long as you correctly list their action on $\sket{i}\sotimes\sket{j}\sotimes\sket{k}$. (In particular, if you use cycle notation and switch the permutations $(123)$ and $(132)$, that's okay -- it's sort of a matter of convention). I'll use the popular //cycle notation//:\n\sbegin{eqnarray}\n\shat{P}_{(1)(2)(3)}\sket{ijk} &=& \sket{ijk} \s\s\n\shat{P}_{(12)(3)}\sket{ijk} &=& \sket{jik} \s\s\n\shat{P}_{(1)(23)}\sket{ijk} &=& \sket{ikj} \s\s\n\shat{P}_{(13)(2)}\sket{ijk} &=& \sket{kji} \s\s\n\shat{P}_{(123)}\sket{ijk} &=& \sket{kij} \s\s\n\shat{P}_{(132)}\sket{ijk} &=& \sket{jki}\n\send{eqnarray}\n\n<<<\n**(c) Prove that\n$$\shat{\sPi}_{\smathrm{symm}} \sequiv \sfrac{1~}{N!}\ssum_{\spi}{\shat{P}_\spi}$$\nis the projector onto the subspace of states that are invariant under all permutations, by showing that: (i) $\shat{\sPi}_{\smathrm{symm}}^2=\shat{\sPi}_{\smathrm{symm}}~$; (ii) for any $\sket{\spsi}$, $\shat{\sPi}_{\smathrm{symm}}\sket\spsi$ is invariant under any permutation; and (iii) if $\sket{\spsi}$ is already invariant under all permutations, then $\shat{\sPi}_{\smathrm{symm}}\sket{\spsi}=\sket{\spsi}$. 
(10 points)\n<<<\n\nLet's begin with a very useful lemma: //Given any permutation operator $\shat{P}_\spi$, $\shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}} = \shat{\sPi}_{\smathrm{symm}}$.// To prove this, we observe that if $\spi,\spi_1,\spi_2$ are all permutations, then $\shat{P}_\spi\shat{P}_{\spi_1} = \shat{P}_{\spi}\shat{P}_{\spi_2}$ if and only if $\spi_1=\spi_2$ (just multiply on both sides by $\shat{P}_{\spi^{-1}}$). Now,\n$$\shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}} = \sfrac{1~}{N!}\ssum_{\spi'}{\shat{P}_{\spi}\shat{P}_{\spi'}},$$\nand because there are $N!$ distinct permutation operators $\shat{P}_{\spi'}$ in the sum, the operators $\shat{P}_{\spi}\shat{P}_{\spi'}$ must //also// be $N!$ distinct permutation operators (because $\shat{P}_\spi\shat{P}_{\spi_1} = \shat{P}_{\spi}\shat{P}_{\spi_2}$ if and only if $\spi_1=\spi_2$). Since there are only $N!$ permutation operators, each one must appear exactly once in the sum, so\n$$\shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}} = \shat{\sPi}_{\smathrm{symm}}.$$\n\nNow, using this lemma, we can easily prove that:\n(i) $\shat{\sPi}_{\smathrm{symm}}^2=\shat{\sPi}_{\smathrm{symm}}~$. This follows because\n$$ \shat{\sPi}_{\smathrm{symm}}^2 = \sfrac{1~}{N!}\ssum_{\spi}{\shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}}} = \sfrac{1~}{N!}\ssum_{\spi}{\shat{\sPi}_{\smathrm{symm}}} = \shat{\sPi}_{\smathrm{symm}}.$$\n(ii) For any $\sket\spsi$ and any permutation $\spi$, the state $\shat{\sPi}_{\smathrm{symm}}\sket\spsi$ is invariant under $\shat{P}_{\spi}$. 
This follows because\n$$ \shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}}\sket\spsi = \sleft(\shat{P}_{\spi}\shat{\sPi}_{\smathrm{symm}}\sright)\sket\spsi = \shat{\sPi}_{\smathrm{symm}}\sket\spsi.$$\n(iii) If, for every permutation $\spi$, $\shat{P}_{\spi}\sket{\spsi} = \sket{\spsi}$, then\n$$ \shat{\sPi}_{\smathrm{symm}}\sket\spsi = \sfrac{1~}{N!}\ssum_{\spi}{\shat{P}_\spi\sket\spsi} = \sfrac{1~}{N!}\ssum_{\spi}{\sket\spsi} = \sket\spsi.$$\n\nThus: (i) $\shat{\sPi}_{\smathrm{symm}}$ is a projector onto some subspace; (ii) it projects onto a subspace of states that are invariant under all permutations; and (iii) it leaves //every// permutation-invariant state unchanged. Therefore, $\shat{\sPi}_{\smathrm{symm}}$ is the projector onto the subspace of all permutation-invariant states. QED.\n\n<<<\n**(d) Each permutation $\spi$ is either //even// or //odd//, depending on whether it is the composition of an even number of swaps, or an odd number (see <http://en.wikipedia.org/wiki/Even_permutation> for more detail, if desired). The parity of $\spi$ is denoted $(-1)^\spi$, and this is $+1$ for an even permutation and $-1$ for an odd permutation. Prove that\n$$\shat{\sPi}_{\smathrm{anti}} \sequiv \sfrac{1~}{N!}\ssum_{\spi}{(-1)^\spi\shat{P}_\spi}$$\nis the projector onto the subspace of states that are -1 eigenvectors of every swap (and therefore of every odd permutation), by showing that: (i) $\shat{\sPi}_{\smathrm{anti}}^2=\shat{\sPi}_{\smathrm{anti}}~$; (ii) for any $\sket{\spsi}$, $\shat{\sPi}_{\smathrm{anti}}\sket\spsi$ is a -1 eigenvector of any swap; and (iii) if $\sket{\spsi}$ is already a -1 eigenvector of any swap, then $\shat{\sPi}_{\smathrm{anti}}\sket{\spsi}=\sket{\spsi}$. 
(10 points)\n<<<\n\nThis proof proceeds along the same lines as the previous one, but the useful lemma that we need is slightly different: //Given any __swap__ operator $\shat{P}_\spi$, $\shat{P}_{\spi}\shat{\sPi}_{\smathrm{anti}} = -\shat{\sPi}_{\smathrm{anti}}$.// To prove this, we follow similar reasoning as in the previous part. For each permutation operator $\shat{P}_{\spi'}$, the product $\shat{P}_{\spi}\shat{P}_{\spi'} = \shat{P}_{\spi''}$ is another permutation operator, determined //uniquely// by $\spi'$. However, if $\spi$ is a swap, then $\shat{P}_{\spi''}$ has the opposite parity from $\shat{P}_{\spi'}$, because we've added exactly one swap, so $(-1)^{\spi'} = -(-1)^{\spi''}$. Therefore,\n$$\shat{P}_{\spi}\shat{\sPi}_{\smathrm{anti}} = \sfrac{1~}{N!}\ssum_{\spi'}{(-1)^{\spi'}\shat{P}_{\spi}\shat{P}_{\spi'}} = \sfrac{1~}{N!}\ssum_{\spi''}{-(-1)^{\spi''}\shat{P}_{\spi''}} = -\shat{\sPi}_{\smathrm{anti}}.$$\nAn immediate corollary (write $\spi$ as a composition of swaps and apply the lemma once per swap) is: //If $\shat{P}_{\spi}$ has parity $(-1)^\spi$, then $\shat{P}_{\spi}\shat{\sPi}_{\smathrm{anti}} = (-1)^{\spi}\shat{\sPi}_{\smathrm{anti}}$.//\n\nNow, using these lemmas, we can easily prove:\n(i) $\shat{\sPi}_{\smathrm{anti}}^2=\shat{\sPi}_{\smathrm{anti}}~$. This follows because $\shat{\sPi}_{\smathrm{anti}}$ is invariant under left multiplication by $(-1)^\spi\shat{P}_{\spi}$, and $\shat{\sPi}_{\smathrm{anti}}$ is the average of the $N!$ such terms.\n(ii) For any $\sket{\spsi}$ and any swap $\spi$, the state $\shat{\sPi}_{\smathrm{anti}}\sket\spsi$ is a -1 eigenvector of $\shat{P}_{\spi}$. 
This follows because\n$$ \shat{P}_{\spi}\shat{\sPi}_{\smathrm{anti}}\sket\spsi = \sleft(\shat{P}_{\spi}\shat{\sPi}_{\smathrm{anti}}\sright)\sket\spsi = -\shat{\sPi}_{\smathrm{anti}}\sket\spsi.$$\n(iii) If, for every swap $\spi$, $\sket{\spsi}$ is already a -1 eigenvector of $\shat{P}_{\spi}$, then $\sket\spsi$ is an eigenvector of any permutation $\spi$ with eigenvalue $(-1)^{\spi}$, so\n$$\shat{\sPi}_{\smathrm{anti}}\sket{\spsi} = \sfrac{1~}{N!}\ssum_{\spi}{(-1)^\spi\shat{P}_\spi\sket{\spsi}} = \sfrac{1~}{N!}\ssum_{\spi}{\sleft[(-1)^\spi\sright]^2 \sket{\spsi}} = \sket\spsi.$$\n\nThus: (i) $\shat{\sPi}_{\smathrm{anti}}$ is a projector onto some subspace; (ii) it projects onto a subspace of states that are -1 eigenvectors of all swaps; and (iii) it leaves //every// state that is already a -1 eigenvector of every swap unchanged. Therefore, $\shat{\sPi}_{\smathrm{anti}}$ is the projector onto the subspace of all antisymmetric states (i.e., states that are -1 eigenvectors of every odd permutation). QED.\n\n<<<\n2. ''Bell states'' (25 points)\nConsider two //distinguishable// spin-$\sfr{1~}{2~}$ particles, labeled $A$ and $B$. 
The four Pauli operators for a single particle are $\s{\sId,\ssigma_x,\ssigma_y,\ssigma_z\s}$, and the two-particle Pauli operators are the 16 tensor products ($\s{\sId,\ssigma_x,\ssigma_y,\ssigma_z\s}_A\sotimes\s{\sId,\ssigma_x,\ssigma_y,\ssigma_z\s}_B$) of those operators -- e.g., $\s{\sId_{AB},\ssigma_x^{(A)}\sotimes\ssigma_z^{(B)},\sldots\s}$.\n**(a) Write the operator $\sproj{\spsi}$ as a linear combination of two-particle Pauli operators, where $\sket\spsi$ is given by: (5 points)\n$$\sket{\spsi} = \sfr{1~}{\ssqrt2~}\sleft(\sket{\suparrow\sdownarrow} - \sket{\sdownarrow\suparrow}\sright).$$\n<<<\n\nThe projector onto this state is:\n\sbegin{eqnarray}\n\sproj{\spsi} &=& \sfrac{1~}{2~}\sleft(\sproj{\suparrow\sdownarrow}+\sproj{\sdownarrow\suparrow}-\sketbra{\suparrow\sdownarrow}{\sdownarrow\suparrow}-\sketbra{\sdownarrow\suparrow}{\suparrow\sdownarrow}\sright) \s\s\n&=& \sfrac{1~}{2~}\sleft( \sproj{\suparrow}\sotimes\sproj{\sdownarrow} + \sproj{\sdownarrow}\sotimes\sproj{\suparrow} - \sketbra{\suparrow}{\sdownarrow}\sotimes\sketbra{\sdownarrow}{\suparrow} - \sketbra{\sdownarrow}{\suparrow}\sotimes\sketbra{\suparrow}{\sdownarrow} \sright)\n\send{eqnarray}\nTo write this in terms of Pauli operators, here are some simple identities:\n\sbegin{eqnarray}\n\sproj{\suparrow} &=& \smat{1&0\s\s0&0} = \sfrac{\sId+\ssigma_z}{2} \s\s\n\sproj{\sdownarrow} &=& \smat{0&0\s\s0&1} = \sfrac{\sId-\ssigma_z}{2} \s\s\n\sketbra{\suparrow}{\sdownarrow} &=& \smat{0&1\s\s0&0} = \sfrac{\ssigma_x+i\ssigma_y}{2} \s\s\n\sketbra{\sdownarrow}{\suparrow} &=& \smat{0&0\s\s1&0} = \sfrac{\ssigma_x-i\ssigma_y}{2}\n\send{eqnarray}\nNow we go back to the previous expression and expand it out using these identities:\n\sbegin{eqnarray}\n\sproj{\spsi} &=& \sfrac{1~}{2~}\sleft( \sproj{\suparrow}\sotimes\sproj{\sdownarrow} + \sproj{\sdownarrow}\sotimes\sproj{\suparrow} - \sketbra{\suparrow}{\sdownarrow}\sotimes\sketbra{\sdownarrow}{\suparrow} - 
\sketbra{\sdownarrow}{\suparrow}\sotimes\sketbra{\suparrow}{\sdownarrow} \sright) \s\s\n&=& \sfrac{1~}{8~}\sleft[ (\sId+\ssigma_z)\sotimes(\sId-\ssigma_z) + (\sId-\ssigma_z)\sotimes(\sId+\ssigma_z) - (\ssigma_x+i\ssigma_y)\sotimes(\ssigma_x-i\ssigma_y) - (\ssigma_x-i\ssigma_y)\sotimes(\ssigma_x+i\ssigma_y) \sright] \s\s\n&=& \sfrac{1~}{8~}\sleft[ \sbegin{array}{l} \s (\sId\sotimes\sId + \ssigma_z\sotimes\sId - \sId\sotimes\ssigma_z - \ssigma_z\sotimes\ssigma_z) + (\sId\sotimes\sId - \ssigma_z\sotimes\sId + \sId\sotimes\ssigma_z - \ssigma_z\sotimes\ssigma_z) \s\s - (\ssigma_x\sotimes\ssigma_x+i\ssigma_y\sotimes\ssigma_x -i\ssigma_x\sotimes\ssigma_y+\ssigma_y\sotimes\ssigma_y) - (\ssigma_x\sotimes\ssigma_x-i\ssigma_y\sotimes\ssigma_x +i\ssigma_x\sotimes\ssigma_y+\ssigma_y\sotimes\ssigma_y)\send{array} \sright] \s\s\n&=& \sfrac{1~}{4~}\sleft[ \sId \sotimes \sId - \ssigma_z\sotimes\ssigma_z - \ssigma_x\sotimes\ssigma_x - \ssigma_y\sotimes\ssigma_y \sright]\n\send{eqnarray}\n\n<<<\n**(b) Show that by acting on $\sket{\spsi}$ with each of the four Pauli operators for system $A$ alone, we obtain four orthogonal states $\s{\sket{\spsi},\sket{\spsi_x},\sket{\spsi_y},\sket{\spsi_z}\s}$. 
(5 points)\n<<<\n\nThere are a couple of ways to go about this, but the easiest (though not the most elegant) is to just work out the states explicitly and confirm that they're mutually orthogonal:\n\sbegin{eqnarray}\n\sket{\spsi} &=& \sfr{1~}{\ssqrt2~}\sleft(\sket{\suparrow\sdownarrow} - \sket{\sdownarrow\suparrow}\sright) \s\s\n\sket{\spsi_x} = \ssigma_x^{(A)}\sket\spsi &=& \sfr{1~}{\ssqrt2~}\sleft(\sket{\sdownarrow\sdownarrow} - \sket{\suparrow\suparrow}\sright) \s\s\n\sket{\spsi_y} = \ssigma_y^{(A)}\sket\spsi &=& \sfr{i~}{\ssqrt2~}\sleft(\sket{\sdownarrow\sdownarrow} + \sket{\suparrow\suparrow}\sright)\s\s\n\sket{\spsi_z} = \ssigma_z^{(A)}\sket\spsi &=& \sfr{1~}{\ssqrt2~}\sleft(\sket{\suparrow\sdownarrow} + \sket{\sdownarrow\suparrow}\sright)\n\send{eqnarray}\nwhere I've used the standard convention that an operator $\shat{Q}_A$ that is defined on the $A$ subsystem acts on the $AB$ system as $\shat{Q}_A\sotimes\sId_B$.\n\nNow, there are six distinct inner products to calculate:\n\sbegin{eqnarray}\n\sbraket{\spsi}{\spsi_x} &=& \sfrac12\sleft(\sbraket{\suparrow\sdownarrow}{\sdownarrow\sdownarrow} - \sbraket{\suparrow\sdownarrow}{\suparrow\suparrow} - \sbraket{\sdownarrow\suparrow}{\sdownarrow\sdownarrow} + \sbraket{\sdownarrow\suparrow}{\suparrow\suparrow}\sright) = 0 \s\s\n\sbraket{\spsi}{\spsi_y} &=& \sfrac{i}2\sleft(\sbraket{\suparrow\sdownarrow}{\sdownarrow\sdownarrow} + \sbraket{\suparrow\sdownarrow}{\suparrow\suparrow} - \sbraket{\sdownarrow\suparrow}{\sdownarrow\sdownarrow} - \sbraket{\sdownarrow\suparrow}{\suparrow\suparrow}\sright) = 0 \s\s\n\sbraket{\spsi}{\spsi_z} &=& \sfrac12\sleft(\sbraket{\suparrow\sdownarrow}{\suparrow\sdownarrow} + \sbraket{\suparrow\sdownarrow}{\sdownarrow\suparrow} - \sbraket{\sdownarrow\suparrow}{\suparrow\sdownarrow} - \sbraket{\sdownarrow\suparrow}{\sdownarrow\suparrow}\sright) = \sfrac12(1-1) = 0 \s\s\n\sbraket{\spsi_x}{\spsi_y} &=& \sfrac{i}2\sleft(\sbraket{\sdownarrow\sdownarrow}{\sdownarrow\sdownarrow} + 
\sbraket{\sdownarrow\sdownarrow}{\suparrow\suparrow} - \sbraket{\suparrow\suparrow}{\sdownarrow\sdownarrow} - \sbraket{\suparrow\suparrow}{\suparrow\suparrow}\sright) = i(1-1) = 0 \s\s\n\sbraket{\spsi_x}{\spsi_z} &=& \sfrac12\sleft(\sbraket{\sdownarrow\sdownarrow}{\suparrow\sdownarrow} + \sbraket{\sdownarrow\sdownarrow}{\sdownarrow\suparrow} - \sbraket{\suparrow\suparrow}{\suparrow\sdownarrow} - \sbraket{\suparrow\suparrow}{\sdownarrow\suparrow}\sright) = 0 \s\s\n\sbraket{\spsi_y}{\spsi_z} &=& \sfrac{i}2\sleft(\sbraket{\sdownarrow\sdownarrow}{\suparrow\sdownarrow} + \sbraket{\sdownarrow\sdownarrow}{\sdownarrow\suparrow} + \sbraket{\suparrow\suparrow}{\suparrow\sdownarrow} + \sbraket{\suparrow\suparrow}{\sdownarrow\suparrow}\sright) = 0\n\send{eqnarray}\n\n<<<\n**(c) Suppose that one of these four states is prepared by an outside party. Show that no measurement acting //only// on the $A$ subsystem, or //only// on the $B$ subsystem, can distinguish which of the four states was prepared. (5 points)\n<<<\n\nThe probabilities for measurements //only// on the $A$ subsystem can be calculated from the reduced density matrix $\srho_A$. Similarly, probabilities for measurements //only// on the $B$ system can be calculated from $\srho_B$. 
Starting with the state $\sket\spsi$ and using the results of part (a), we have\n\sbegin{eqnarray}\n\srho_A &=& \sTr_B[\sproj{\spsi}] \s\s\n&=& \sfrac14 \sTr_B\sleft[ \sId \sotimes \sId - \ssigma_z\sotimes\ssigma_z - \ssigma_x\sotimes\ssigma_x - \ssigma_y\sotimes\ssigma_y \sright] \s\s\n&=& \sfrac12\sId \s\s\n\srho_B &=& \sTr_A[\sproj{\spsi}] \s\s\n&=& \sfrac14 \sTr_A\sleft[ \sId \sotimes \sId - \ssigma_z\sotimes\ssigma_z - \ssigma_x\sotimes\ssigma_x - \ssigma_y\sotimes\ssigma_y \sright] \s\s\n&=& \sfrac12\sId\n\send{eqnarray}\nNow, each of the Pauli operators $\ssigma_k$ is a unitary operator, and when we apply $\ssigma_k^{(A)}$ to $\sket\spsi$ (to produce $\sket{\spsi_x},\sket{\spsi_y},\sket{\spsi_z}$), the reduced density matrix on the $A$ system transforms as\n$$\srho_A \sto \ssigma_k\srho_A\ssigma_k^\sdagger.$$\nHowever, since $\srho_A=\sId/2$ commutes with every unitary, //all these states have exactly the same $\srho_A$//! Therefore, each state predicts __exactly__ the same probabilities for the outcomes of every measurement on the $A$ subsystem, and they cannot be distinguished by any measurement.\n\nIf we consider measurements on the $B$ subsystem, there is an even easier argument. Each of the unitary operators $\ssigma_k^{(A)}$ used to create the $\sket{\spsi_k}$ states acts only on the $A$ subsystem -- so they commute with every operator on the $B$ subsystem! Therefore, they cannot change $\srho_B$ at all, and so all the states have the same $\srho_B$, and therefore cannot be distinguished by any measurement on the $B$ subsystem alone.\n\n<<<\n**(d) Explain why "measure $\ssigma_z$ on $A$ //and// measure $\ssigma_z$ on $B$" and "measure the observable $\ssigma_z\sotimes\ssigma_z$" describe __different__ measurements. (5 points)\n<<<\n\nMeasuring $\ssigma_z$ on $A$ yields one of two outcomes ("up" or "down"), and so does measuring $\ssigma_z$ on $B$. 
Therefore, this measurement has four distinct outcomes:\n$$\smathcal{M} = \s{\sproj{\suparrow\suparrow},\sproj{\suparrow\sdownarrow},\sproj{\sdownarrow\suparrow},\sproj{\sdownarrow\sdownarrow}\s}.$$\n\nThe observable $\ssigma_z\sotimes\ssigma_z$, on the other hand, has only two eigenvalues,\n$$\ssigma_z\sotimes\ssigma_z = \smat{1&0&0&0\s\s0&-1&0&0\s\s0&0&-1&0\s\s0&0&0&1},$$\nand therefore the measurement of $\ssigma_z\sotimes\ssigma_z$ has only two possible outcomes. This measurement does not distinguish between $\sket{\suparrow\sdownarrow}$ and $\sket{\sdownarrow\suparrow}$, for instance.\n\n<<<\n**(e) Suppose $\sket\spsi$ is prepared, Alice measures one of $\s{\ssigma_x,\ssigma_y,\ssigma_z\s}$ on subsystem $A$, and Bob measures the exact same observable on $B$. Show that their results will be perfectly anticorrelated -- i.e., if Alice observes an eigenstate with a +1 eigenvalue, then Bob will always observe the eigenstate with a -1 eigenvalue. (5 points)\n<<<\n\nAgain, there are a couple of ways to do this. Here's the most elegant that I know of.\n\nFirst, note that each of these measurements has exactly two values, $\s{+1,-1\s}$. Therefore, the expectation value of $\ssigma_k\sotimes\ssigma_k$ must be a weighted average of $(+1)(+1)=1$, $(+1)(-1)=-1$, $(-1)(+1) = -1$, and $(-1)(-1)=1$. 
So, let's compute the expectation value of $\ssigma_k\sotimes\ssigma_k$:\n\sbegin{eqnarray}\n\sexpect{\ssigma_x\sotimes\ssigma_x} = \sbraopket{\spsi}{\ssigma_x\sotimes\ssigma_x}{\spsi} &=& \sTr[\sproj{\spsi}\ssigma_x\sotimes\ssigma_x] \s\s\n&=& \sfrac{1}{4}\sTr[(\sId \sotimes \sId - \ssigma_z\sotimes\ssigma_z - \ssigma_x\sotimes\ssigma_x - \ssigma_y\sotimes\ssigma_y)(\ssigma_x\sotimes\ssigma_x)] \s\s\n&=& \sfrac{1}{4}\sTr[\ssigma_x \sotimes \ssigma_x + \ssigma_y\sotimes\ssigma_y - \sId\sotimes\sId + \ssigma_z\sotimes\ssigma_z] \s\s\n&=& -1 \s\s\n\sexpect{\ssigma_y\sotimes\ssigma_y} &=& \sTr[\sproj{\spsi}\ssigma_y\sotimes\ssigma_y] \s\s\n&=& -1 \s\s\n\sexpect{\ssigma_z\sotimes\ssigma_z} &=& \sTr[\sproj{\spsi}\ssigma_z\sotimes\ssigma_z] \s\s\n&=& -1\n\send{eqnarray}\nNow, as observed previously, Alice and Bob are not "measuring $\ssigma_k\sotimes\ssigma_k$" -- but they __can__ compute its expectation value from what they measure. The fact that all those expectation values are $-1$ means that whenever Alice gets $+1$, Bob //has// to get $-1$, and vice-versa. Otherwise, if they got the same value some of the time, the expectation value of $\ssigma_k\sotimes\ssigma_k$ wouldn't be exactly $-1$.\n\n<<<\n3. ''The no-cloning theorem'' (15 points)\nSuppose that an inventor comes to you, claiming to have designed a device that clones quantum states. He claims that if you feed in one system prepared in an arbitrary state $\sket\spsi$, and another "ancilla" system with the same Hilbert space, but prepared in a fixed state $\sket{0}$, then his device will leave //both// systems in the state $\sket{\spsi}$. Show that this violates the rule that quantum transformations must be linear maps on density matrices, i.e., $\smathcal{T}(\srho_1+\srho_2) = \smathcal{T}(\srho_1)+\smathcal{T}(\srho_2)$.\n<<<\n\nThe proof of this is really quite easy -- but it's so startlingly important for understanding modern quantum theory that I've assigned it 15 points. 
Also, it took until 1982 for anybody to think of it!\n\nThis guy claims that he has a device that implements a transformation $\smathcal{T}$ on a two-system Hilbert space $\smathcal{H}_{A}\sotimes\smathcal{H}_B$, which has the following property:\n$$\smathcal{T}:\sket{\spsi}\sotimes\sket{0} \sto \sket{\spsi}\sotimes\sket{\spsi}.$$\nNow, this map $\smathcal{T}$ might not be a reversible map, so really we should consider it as a map from density matrices to density matrices:\n$$\smathcal{T}\sleft[\sproj{\spsi}\sotimes\sproj{0}\sright] = \sproj{\spsi}\sotimes\sproj{\spsi}.$$\n\nOkay, let's assume that there are at least two orthogonal states in $\smathcal{H}_A$ (otherwise it's 1-dimensional and this whole thing is trivial), which I'll call $\sket{0}$ and $\sket{1}$. Let's define two superpositions of these states, $\sket{+} = \sfrac{\sket{0}+\sket{1}}{\ssqrt2}$ and $\sket{-} = \sfrac{\sket{0}-\sket{1}}{\ssqrt2}$. Now, let's //assume// that $\smathcal{T}$ is a linear map on density matrices (and derive a contradiction). Note, first, that\n$$\sproj{0}+\sproj{1} = \sId = \sproj{+}+\sproj{-},$$\nand therefore for any linear map $\smathcal{T}$:\n\sbegin{eqnarray}\n\smathcal{T}[ \sproj{0}\sotimes\sproj{0} ] + \smathcal{T}[ \sproj{1}\sotimes\sproj{0} ] &=& \smathcal{T}[ (\sproj{0} + \sproj{1})\sotimes\sproj{0} ] \s\s\n&=& \smathcal{T}[ \sId \sotimes \sproj{0} ] \s\s\n&=& \smathcal{T}[ (\sproj{+} + \sproj{-})\sotimes\sproj{0} ] \s\s\n&=& \smathcal{T}[ \sproj{+}\sotimes\sproj{0} ] + \smathcal{T}[ \sproj{-}\sotimes\sproj{0} ].\n\send{eqnarray}\nHowever, by the claimed definition of $\smathcal{T}$,\n\sbegin{eqnarray}\n\smathcal{T}[ \sproj{0}\sotimes\sproj{0} ] + \smathcal{T}[ \sproj{1}\sotimes\sproj{0} ] &=& \sproj{0}\sotimes\sproj{0} + \sproj{1}\sotimes\sproj{1} \s\s\n\smathcal{T}[ \sproj{+}\sotimes\sproj{0} ] + \smathcal{T}[ \sproj{-}\sotimes\sproj{0} ] &=& \sproj{+}\sotimes\sproj{+} + \sproj{-}\sotimes\sproj{-}\n\send{eqnarray}\nand these two expressions are ''not'' equal. 
This can be shown in a variety of ways; one easy one is to calculate the expectation value of $\ssigma_z\sotimes\ssigma_z = (\sproj{0}-\sproj{1})\sotimes(\sproj{0}-\sproj{1})$ for each of these [non-normalized] states:\n\sbegin{eqnarray}\n\sTr[ \ssigma_z\sotimes\ssigma_z(\sproj{0}\sotimes\sproj{0} + \sproj{1}\sotimes\sproj{1}) ] &=& 1\scdot 1 + (-1)\scdot(-1) = 2 \s\s\n\sTr[ \ssigma_z\sotimes\ssigma_z(\sproj{+}\sotimes\sproj{+} + \sproj{-}\sotimes\sproj{-}) ] &=& 0\scdot 0 + (0)\scdot(0) = 0.\n\send{eqnarray}\nThus, we have shown a contradiction, and the only nontrivial assumption we made is that $\smathcal{T}$ is a linear map, so this must be false.
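The inequality above is easy to check numerically. Here is a minimal sketch in plain Python (no external libraries); the helper names `kron`, `proj`, `add` and `trace_prod` are my own and not part of the problem. It builds the two candidate outputs of the claimed cloner, confirms that the corresponding //inputs// are the same operator, and evaluates the trace against sigma_z (x) sigma_z for each output.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def proj(v):
    """Projector |v><v| for a ket given as a list of amplitudes."""
    return [[x * y.conjugate() for y in v] for x in v]

def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace_prod(a, b):
    """Tr[a b] for two square matrices of the same size."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

s = 1 / sqrt(2)
ket0, ket1 = [1.0, 0.0], [0.0, 1.0]
plus, minus = [s, s], [s, -s]
sigma_z = [[1.0, 0.0], [0.0, -1.0]]
zz = kron(sigma_z, sigma_z)

# The two decompositions of the input are the same operator (the identity) ...
input_01 = add(proj(ket0), proj(ket1))
input_pm = add(proj(plus), proj(minus))
inputs_agree = all(abs(input_01[i][j] - input_pm[i][j]) < 1e-12
                   for i in range(2) for j in range(2))

# ... but the outputs a cloner would have to produce are different operators.
out_01 = add(kron(proj(ket0), proj(ket0)), kron(proj(ket1), proj(ket1)))
out_pm = add(kron(proj(plus), proj(plus)), kron(proj(minus), proj(minus)))
zz_01 = trace_prod(zz, out_01)
zz_pm = trace_prod(zz, out_pm)
print(inputs_agree, zz_01, zz_pm)
```

Equal inputs with different traces on the outputs is exactly the contradiction with linearity. Any other observable that separates the two outputs would serve just as well; sigma_z (x) sigma_z is simply the one used in the text.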
<<<\n4. ''Measurements and quantum operations'' (30 points)\nConsider, again, two //distinguishable// spin-$\sfr{1~}{2~}$ particles, labeled $A$ and $B$. When tensor product states and operators are written, the $A$ object is given first; i.e. $\ssigma_x\sotimes\ssigma_y$ should be interpreted as $\ssigma_x^{(A)}\sotimes\ssigma_y^{(B)}$. Factors of $\shbar$ will be ignored in this problem.\n**(a) Write the Hamiltonian $H = \sproj{\suparrow}\sotimes\ssigma_x + \sproj{\sdownarrow}\sotimes\sId$ as a $4\stimes4$ operator in the standard tensor product basis $\s{\sket{\suparrow\suparrow},\sket{\suparrow\sdownarrow},\sket{\sdownarrow\suparrow},\sket{\sdownarrow\sdownarrow}\s}$, and calculate the $4\stimes4$ unitary operator $U = e^{-i(\spi/2) H}$. (5 points)\n<<<\n\n''This problem had a small but absolutely critical mistake in it'': I originally wrote $U = e^{-i\spi H}$, which leads to the trivial unitary $U = -\sId$. As a result, //any// answer for any part of this problem is worth full credit. The following solution is for the problem as it was //supposed// to be posed, with $U = e^{-i(\spi/2) H}$.\n\n$$H = \smat{\ssigma_x & 0 \s\s 0 & \sId} = \smat{0 & 1 & 0 & 0 \s\s 1 & 0 & 0 & 0 \s\s 0 & 0 & 1 & 0 \s\s 0 & 0 & 0 & 1}.$$\n\n$$U = e^{-i(\spi/2) H} = \smat{e^{-i(\spi/2)\ssigma_x} & 0 \s\s 0 & e^{-i(\spi/2)\sId}} = \smat{0 & -i & 0 & 0 \s\s -i & 0 & 0 & 0 \s\s 0 & 0 & -i & 0 \s\s 0 & 0 & 0 & -i} = -i\smat{0 & 1 & 0 & 0 \s\s 1 & 0 & 0 & 0 \s\s 0 & 0 & 1 & 0 \s\s 0 & 0 & 0 & 1}.$$\n\n<<<\n**(b) Suppose that $A$ is prepared in an arbitrary state $\sket{\spsi_0} = \salpha\sket{\suparrow}+\sbeta\sket{\sdownarrow}$, and $B$ is prepared in the state $\sket\sdownarrow$, and then $U$ is applied to the $AB$ system. Write down $\sket{\spsi_f} = U\sket{\spsi_0}$ and $\srho_f = \sTr_B{\sproj{\spsi_f}}$. (5 points)\n<<<\n\nFirst of all, the joint initial state is $\sket{\spsi_0} = \salpha\sket{\suparrow}\sotimes\sket{\sdownarrow} + \sbeta\sket{\sdownarrow}\sotimes\sket{\sdownarrow}$. 
Then,\n\sbegin{eqnarray}\n\sket{\spsi_f} &=& U\sket{\spsi_0} \s\s\n&=& U(\salpha\sket{\suparrow\sdownarrow} + \sbeta\sket{\sdownarrow\sdownarrow}) \s\s\n&=& -i\salpha\sket{\suparrow\suparrow} -i\sbeta\sket{\sdownarrow\sdownarrow}\n\send{eqnarray}\nand\n\sbegin{eqnarray}\n\srho_f &=& \sTr_B{\sproj{\spsi_f}} \s\s\n&=& \sTr_B {\sleft[ \salpha^*\salpha\sproj{\suparrow\suparrow} + \sbeta^*\sbeta\sproj{\sdownarrow\sdownarrow} + \salpha\sbeta^*\sketbra{\suparrow\suparrow}{\sdownarrow\sdownarrow} + \salpha^*\sbeta\sketbra{\sdownarrow\sdownarrow}{\suparrow\suparrow}\sright]} \s\s\n&=& \salpha^*\salpha\sproj{\suparrow}\sbraket{\suparrow}{\suparrow} + \sbeta^*\sbeta\sproj{\sdownarrow}\sbraket{\sdownarrow}{\sdownarrow} + \salpha\sbeta^*\sketbra{\suparrow}{\sdownarrow}\sbraket{\sdownarrow}{\suparrow} + \salpha^*\sbeta\sketbra{\sdownarrow}{\suparrow}\sbraket{\suparrow}{\sdownarrow} \s\s\n&=& |\salpha|^2\sproj{\suparrow} + |\sbeta|^2\sproj{\sdownarrow}\n\send{eqnarray}\n\n<<<\n**(c) The physical operation described in part (b) is a unitary operation on $AB$, but not on $A$, because it maps pure states to mixed states. It is, however, a linear operation on density operators. Write that linear operation $\smathcal{E}:\sproj{\spsi_0}\sto\srho_f$ as a $4\stimes4$ "superoperator" matrix in the basis of Pauli operators. (Recall that the Pauli operators form a basis for the 4-dimensional space of operators on $\smathcal{H}_A$). (10 points)\n<<<\n\nSince the Pauli operators form a basis, it's sufficient to know what $\smathcal{E}$ does to each of the Pauli operators. 
From the previous problem, we only know how $\smathcal{E}$ acts on states:\n$$\smathcal{E}\sleft[ (\salpha\sket{\suparrow}+\sbeta\sket{\sdownarrow})(\salpha^*\sbra{\suparrow}+\sbeta^*\sbra{\sdownarrow}) \sright] = |\salpha|^2\sproj{\suparrow} + |\sbeta|^2\sproj{\sdownarrow}.$$\nThis formula immediately tells us that: (a) $\sproj{\suparrow}$ and $\sproj{\sdownarrow}$ are unaffected by $\smathcal{E}$; and (b) the states $\sproj{\srightarrow}$, $\sproj{\sleftarrow}$, $\sproj{\sinarrow}$, and $\sproj{\soutarrow}$ all get mapped to the maximally mixed state $\srho_f = \sfr{1~}{2~}\sId$.\n\nNow, because $\smathcal{E}$ is a //linear// map, we can take advantage of the handy fact that for any Pauli operator $\ssigma_k$, the projectors onto its $\spm1$ eigenstates can be written as\n$$\sproj{\spm_k} = \sfrac{\sId \spm \ssigma_k}{2},$$\nand therefore each of the Pauli operators can be written in terms of its eigenstate projectors as\n$$\ssigma_k = \sproj{+_k} - \sproj{-_k}.$$\nWe can also write the identity operator as a //sum// of projectors,\n$$\sId = \sproj{\suparrow}+\sproj{\sdownarrow}.$$\n\nTherefore,\n\sbegin{eqnarray}\n\smathcal{E}[\sId] &=& \smathcal{E}[ \sproj{\suparrow} + \sproj{\sdownarrow} ] = \smathcal{E}[ \sproj{\suparrow} ] + \smathcal{E}[ \sproj{\sdownarrow} ] = \sproj{\suparrow}+\sproj{\sdownarrow} = \sId \s\s\n\smathcal{E}[\ssigma_x] &=& \smathcal{E}[ \sproj{\sleftarrow} - \sproj{\srightarrow} ] = \smathcal{E}[ \sproj{\sleftarrow} ] - \smathcal{E}[ \sproj{\srightarrow} ] = \sfrac{\sId-\sId}{2} = 0 \s\s\n\smathcal{E}[\ssigma_y] &=& \smathcal{E}[ \sproj{\sinarrow} - \sproj{\soutarrow} ] = \smathcal{E}[ \sproj{\sinarrow} ] - \smathcal{E}[ \sproj{\soutarrow} ] = \sfrac{\sId-\sId}{2} = 0 \s\s\n\smathcal{E}[\ssigma_z] &=& \smathcal{E}[ \sproj{\suparrow} - \sproj{\sdownarrow} ] = \smathcal{E}[ \sproj{\suparrow} ] - \smathcal{E}[ \sproj{\sdownarrow} ] = \sproj{\suparrow}-\sproj{\sdownarrow} = \ssigma_z\n\send{eqnarray}\nWe conclude that $\smathcal{E}$ acts in a 
very simple manner on the Pauli basis. In the operator basis $\s{\sId, \ssigma_x, \ssigma_y, \ssigma_z\s}$, $\smathcal{E}$ is\n$$\smathcal{E} = \smat{1 & 0 & 0 & 0 \s\s 0 & 0 & 0 & 0 \s\s 0 & 0 & 0 & 0 \s\s 0 & 0 & 0 & 1}.$$\n\n<<<\n**(d) Find two operators $\shat{K}_1,\shat{K}_2$ such that the linear operation $\smathcal{E}$ can be written as $\smathcal{E}(\srho_0) = \shat{K}_1\srho_0\shat{K}_1^\sdagger + \shat{K}_2\srho_0\shat{K}_2^\sdagger$. (5 points)\n<<<\n\nThere are at least two possibilities here. As we saw in part (c), $\smathcal{E}$ does nothing to $\sId$ or to $\ssigma_z$, but annihilates $\ssigma_x$ and $\ssigma_y$. Two possible pairs of operators are:\n\n1. $\s{\shat{K}_i\s} = \sleft\s{\sfrac{\sId}{\ssqrt{2}}, \sfrac{\ssigma_z}{\ssqrt{2}}\sright\s}$. Then\n$$\smathcal{E}(\srho) = \sfrac{1}{2}\srho + \sfrac{1}{2}\ssigma_z\srho\ssigma_z,$$\nand it's fairly easy to show that this map, indeed, maps $\sId$ and $\ssigma_z$ to themselves, while mapping $\ssigma_x$ and $\ssigma_y$ to $0$ (because $\ssigma_z$ anticommutes with both, so $\ssigma_z\ssigma_x\ssigma_z = -\ssigma_x$ and $\ssigma_z\ssigma_y\ssigma_z = -\ssigma_y$).\n\n2. $\s{\shat{K}_i\s} = \sleft\s{ \sproj{\suparrow},\sproj{\sdownarrow} \sright\s}$. Then\n$$\smathcal{E}(\srho) = \sproj{\suparrow}\srho\sproj{\suparrow} + \sproj{\sdownarrow}\srho\sproj{\sdownarrow},$$\nand this is also shown fairly easily to act in the correct way.\n\nThese each have fairly nice operational interpretations. The first is "Either do nothing, or rotate by $\spi$ around the $z$ axis, with equal probability." The second is "Measure $\ssigma_z$, collapsing into an eigenstate of $\ssigma_z$, but don't look at the outcome." It's rather remarkable that these two procedures both lead to the same map!\n\n<<<\n**(e) This operation describes a measurement of $\ssigma_z$ on $A$ (sometimes it's called a "pre-measurement", because there is no collapse), in the sense that: (1) information about $A$ is transferred to $B$, where it can be extracted by measuring $B$ alone; and (2) the state of $A$ after the operation is a mixture of the eigenstates of $\ssigma_z$. 
Show that if we apply $U$ to the combined system, //then// measure $\ssigma_z$ on system $B$, the probabilities for the outcomes are exactly the same as if we had measured $\ssigma_z$ on system $A$ in the first place (i.e., before applying $U$). (5 points)\n<<<\n\nIf system $A$ starts in a pure state $\salpha\sket{\suparrow}+\sbeta\sket{\sdownarrow}$, then the probabilities for a $\ssigma_z$ measurement on $A$ //before// the interaction are\n\sbegin{eqnarray}\np_\suparrow &=& |\salpha|^2 \s\s\np_\sdownarrow &=& |\sbeta|^2.\n\send{eqnarray}\nFrom part (b), the joint state after the interaction is\n$$\sket{\spsi_f} = -i(\salpha\sket{\suparrow\suparrow} + \sbeta\sket{\sdownarrow\sdownarrow}).$$\nIf we want to calculate the probabilities of a $\ssigma_z$ measurement on $B$, we need to compute the reduced density matrix $\srho_B$. The easy way to do this is to observe that $\sket{\spsi_f}$ is symmetric with respect to swapping $A$ and $B$, and therefore $\srho_B = \srho_A = |\salpha|^2\sproj{\suparrow} + |\sbeta|^2\sproj{\sdownarrow}$ (also from part (b)). The probabilities for a $\ssigma_z$ measurement on $B$ //after// the interaction are therefore\n\sbegin{eqnarray}\np_\suparrow &=& |\salpha|^2 \s\s\np_\sdownarrow &=& |\sbeta|^2,\n\send{eqnarray}\nwhich is exactly what we were looking for.\n\n<<<\n5. ''Bonus problem for those who want more fun'' (i.e., this is not graded, but it //will// enhance your understanding)\nSuppose that in Problem 4, we applied the Hamiltonian for a shorter period of time, so $U = e^{-i(\spi/4)H}$. 
Everything else is the same.\n**(a) Write the resulting linear operation $\smathcal{E}:\sproj{\spsi_0}\sto\srho_f$ as a $4\stimes4$ "superoperator" matrix in the basis of Pauli operators.\n<<<\n\n''Note: as in the previous problem, I originally misstated the time here; it should have been $\spi/4$, not $\spi/2$.''\n\nRecall that \n$$H = \smat{\ssigma_x & 0 \s\s 0 & \sId} = \smat{0 & 1 & 0 & 0 \s\s 1 & 0 & 0 & 0 \s\s 0 & 0 & 1 & 0 \s\s 0 & 0 & 0 & 1},$$\nand so in this case\n$$U = e^{-i(\spi/4) H} = \smat{e^{-i(\spi/4)\ssigma_x} & 0 \s\s 0 & e^{-i(\spi/4)\sId}} = \smat{\sfr{1~}{\ssqrt2~} & -\sfr{i~}{\ssqrt2~} & 0 & 0 \s\s \sfr{-i~}{\ssqrt2~} & \sfr{1~}{\ssqrt2~} & 0 & 0 \s\s 0 & 0 & \sfr{1-i~}{\ssqrt2~} & 0 \s\s 0 & 0 & 0 & \sfr{1-i}{\ssqrt2~}}.$$\nActing on an initial state $\sket{\spsi_0} = \salpha\sket{\suparrow\sdownarrow} + \sbeta\sket{\sdownarrow\sdownarrow}$, this yields\n$$\sket{\spsi_f} = U\sket{\spsi_0} = \sfrac{1}{\ssqrt2}\sleft[ \salpha\sket{\suparrow\sdownarrow} -i\salpha\sket{\suparrow\suparrow} + (1-i)\sbeta\sket{\sdownarrow\sdownarrow}\sright],$$\nwhich, if we look closely, is just an equal superposition of the original state and the state that we //would// have gotten if we'd applied $H$ for the full time $\spi/2$!\n\nAnyway, we can calculate $\srho_f$ for the $A$ subsystem by tracing out $B$:\n\sbegin{eqnarray}\n\srho_f &=& \sTr_B[ \sproj{\spsi_f} ] \s\s\n&=& \sfrac{1}{2}\sleft[ |\salpha|^2\sproj{\suparrow} + |\salpha|^2\sproj{\suparrow} + 2|\sbeta|^2\sproj{\sdownarrow} + (1+i)\salpha\sbeta^*\sketbra{\suparrow}{\sdownarrow} + (1-i)\salpha^*\sbeta\sketbra{\sdownarrow}{\suparrow} \sright] \s\s\n&=& |\salpha|^2\sproj{\suparrow} + |\sbeta|^2\sproj{\sdownarrow} + \sfrac{e^{i\spi/4}}{\ssqrt{2}}\salpha\sbeta^*\sketbra{\suparrow}{\sdownarrow} + \sfrac{e^{-i\spi/4}}{\ssqrt{2}}\salpha^*\sbeta\sketbra{\sdownarrow}{\suparrow}\n\send{eqnarray}\nApplying this in particular to states of the form $\sket\spsi = 
\sfrac{1}{\ssqrt2}(\sket{\suparrow}+e^{i\sphi}\sket{\sdownarrow})$ (which includes all the eigenstates of $\ssigma_x$ and $\ssigma_y$), we get\n$$\sproj{\spsi} \sto e^{i(\spi/8)\ssigma_z}\sleft[ \sfrac{1}{\ssqrt{2}}\sproj{\spsi} + \sleft(1-\sfrac{1}{\ssqrt{2}}\sright)\sfrac{\sId}{2}\sright]e^{-i(\spi/8)\ssigma_z}.$$\nWhat does this //mean//? Well, two things are happening to the state $\srho_A$:\n# The off-diagonal elements (in the $\ssigma_z$ basis) get decreased by a factor of $1/\ssqrt{2}$, and\n# The resulting state is getting rotated around the $\ssigma_z$ axis by an angle of $\spi/4$.\nBy writing the Pauli operators as differences of their eigenstates (as in problem #4), we can work out explicitly what happens to $\sId,\ssigma_x,\ssigma_y,\ssigma_z$, although it's also implicit in the two points just noted:\n\sbegin{eqnarray}\n\smathcal{E}(\sId) &=& \sId \s\s\n\smathcal{E}(\ssigma_x) &=& \sfrac12(\ssigma_x - \ssigma_y) \s\s\n\smathcal{E}(\ssigma_y) &=& \sfrac12(\ssigma_x + \ssigma_y) \s\s\n\smathcal{E}(\ssigma_z) &=& \ssigma_z,\n\send{eqnarray}\nwhich lets us write $\smathcal{E}$ in the Pauli basis $\s{\sId,\ssigma_x,\ssigma_y,\ssigma_z\s}$ (each column holding the image of one basis operator) as:\n$$\smathcal{E} = \smat{1 & 0 & 0 & 0 \s\s 0 & \sfr12 & \sfr12 & 0 \s\s 0 & -\sfr12 & \sfr12 & 0 \s\s 0 & 0 & 0 & 1}.$$\n\n<<<\n**(b) After applying $U$, we measure $\ssigma_z$ on the $B$ system, obtaining "up" with probability $p_0$ and "down" with probability $p_1$. Determine what measurement we have effectively made on the $A$ system by finding positive operators $\s{\shat{E}_{0},\shat{E}_{1}\s}$ (on the $A$ system) such that\n$$p_k = \sTr[\sproj{\spsi_0}\shat{E}_k]$$\n<<<\n\nFirst, we need to determine the probabilities of getting "up" and "down" when we make the measurement on the $B$ system. 
The joint state after applying $U$ is\n\sbegin{eqnarray}\n\sket{\spsi_f} &=& \sfrac{1}{\ssqrt2}\sleft[ \salpha\sket{\suparrow\sdownarrow} -i\salpha\sket{\suparrow\suparrow} + (1-i)\sbeta\sket{\sdownarrow\sdownarrow}\sright] \s\s\n&=& \sfrac{1}{\ssqrt2}\sleft[ \ssqrt{2}\salpha\sket{\suparrow}\sotimes\sleft(\sfrac{\sket{\sdownarrow}-i\sket{\suparrow}}{\ssqrt{2}}\sright) + (1-i)\sbeta\sket{\sdownarrow}\sotimes\sket{\sdownarrow} \sright]\n\send{eqnarray}\nand to determine probabilities of measurements on $B$, we need to calculate $\srho_B = \sTr_A(\sproj{\spsi_f})$:\n\sbegin{eqnarray}\n\srho_B &=& \sTr_A[ \sproj{\spsi_f} ] \s\s\n&=& \sleft[ |\salpha|^2\sfrac{\sleft(\sproj{\sdownarrow}+\sproj{\suparrow}-i\sketbra{\suparrow}{\sdownarrow}+i\sketbra{\sdownarrow}{\suparrow}\sright)}{2} + |\sbeta|^2\sproj{\sdownarrow} \sright]\n\send{eqnarray}\nThe probability of "up" is\n$$p_0 = \sbraopket{\suparrow}{\srho_B}{\suparrow} = \sfrac{|\salpha|^2}{2}$$\nand the probability of "down" is\n$$p_1 = \sbraopket{\sdownarrow}{\srho_B}{\sdownarrow} = \sfrac{|\salpha|^2}{2} + |\sbeta|^2 = 1-\sfrac{|\salpha|^2}{2},$$\nwhere the last equality follows because $|\salpha|^2 + |\sbeta|^2=1$, from normalization.\n\nNow, recall that the //initial// density matrix for the $A$ subsystem is\n$$\srho_0 = \smat{ |\salpha|^2 & \salpha\sbeta^* \s\s \salpha^*\sbeta & |\sbeta|^2 },$$\nand so if we define\n\sbegin{eqnarray}\n\shat{E}_0 &=& \sfrac{\sproj{\suparrow}}{2} = \smat{1/2 & 0 \s\s 0 & 0} \s\s\n\shat{E}_1 &=& \sfrac{\sproj{\suparrow}}{2} + \sproj{\sdownarrow} = \smat{1/2 & 0 \s\s 0 & 1},\n\send{eqnarray}\nthen this satisfies $p_k = \sTr[\sproj{\spsi_0}\shat{E}_k]$.\n\nWhat kind of a measurement is this? It's a measurement of $\ssigma_z$ that only succeeds half the time. When it fails, it just reports "down", regardless of the actual state. This has a very nice interpretation in terms of the "pre-measurement" process. 
By applying the interaction Hamiltonian $H$ for time $\spi/2$, we executed a //controlled-NOT// gate between the $A$ and $B$ systems, thus recording information about $\ssigma_z^{(A)}$ in system $B$. However, if we only let the interaction go for time $\spi/4$, we haven't completely recorded the information! Essentially, we end up in a superposition of "information was successfully recorded" and "no information was recorded". Measuring $B$ collapses this superposition, and 50% of the time, we get the right answer... but 50% of the time, we find that no interaction actually happened, and $B$ just stayed in the $\sket{\sdownarrow}$ state regardless of $A$'s state. Of course, if we get the "down" outcome, we don't actually know which possibility happened -- it might be that $A$ was just in the "down" state, or it might be that the measurement failed!\n
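The calculation in this problem can be cross-checked numerically. The following is only a sketch, assuming NumPy and the basis ordering $\sket{\suparrow\suparrow},\sket{\suparrow\sdownarrow},\sket{\sdownarrow\suparrow},\sket{\sdownarrow\sdownarrow}$ (index 0 is "up"); the helper names `channel` and `effect` are hypothetical, introduced only for illustration. It builds $U = e^{-i(\spi/4)H}$, computes the Pauli-basis matrix of $\smathcal{E}$ by tracing out $B$, and extracts the effective measurement operators on $A$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, sx, sy, sz]

# Interaction Hamiltonian: sigma_x on B when A is up, identity when A is down.
H = np.block([[sx, np.zeros((2, 2))], [np.zeros((2, 2)), I2]])
# H squares to the identity, so exp(-i t H) = cos(t) 1 - i sin(t) H.
t = np.pi / 4
U = np.cos(t) * np.eye(4) - 1j * np.sin(t) * H

down = np.array([0, 1], dtype=complex)  # initial state of B


def channel(op_A):
    """Apply U to op_A (x) |down><down| and trace out B."""
    joint = np.kron(op_A, np.outer(down, down.conj()))
    out = U @ joint @ U.conj().T
    # Partial trace over B: indices are (a, b, a', b'); sum over b = b'.
    return np.einsum('abcb->ac', out.reshape(2, 2, 2, 2))


# Entry (j, k) is the sigma_j component of the image of sigma_k.
E = np.array([[np.trace(paulis[j] @ channel(paulis[k])).real / 2
               for k in range(4)] for j in range(4)])


def effect(b):
    """Operator on A whose expectation gives Prob(B is found in state b)."""
    proj = np.kron(I2, np.outer(b, b.conj()))
    M = (U.conj().T @ proj @ U).reshape(2, 2, 2, 2)
    # Sandwich B's indices with its initial state |down>.
    return np.einsum('b,abcd,d->ac', down.conj(), M, down)


up = np.array([1, 0], dtype=complex)
print(np.round(E, 3))
print(np.round(effect(up), 3))
print(np.round(effect(down), 3))
```

Because `channel` is linear, feeding it the Pauli operators directly (rather than density matrices) gives the columns of the superoperator matrix in one pass.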
The [[quantum harmonic oscillator]] is one of the most conceptually and practically useful physical systems in quantum theory. The lowest-energy eigenstate of the harmonic oscillator Hamiltonian, usually denoted $\sket{0}$ because it is the zero-eigenvalue eigenstate of the //number operator// $\shat{N} = \shat{a}^\sdagger\shat{a}$, has a variety of useful and interesting properties.\n\n!!! Defining coordinates\n\nIf the Hamiltonian of the harmonic oscillator we're interested in is\n$$H = \sfrac{\somega}{2}\sleft(\sfrac{p_0^2}{m\somega} + m\somega x_0^2\sright)$$\nthen for convenience we choose new coordinates\n\sbegin{eqnarray}\nx &\sequiv& \ssqrt{m\somega} x_0 \s\s\np &\sequiv& \sfrac{1}{\ssqrt{m\somega}} p_0 \s\s\n&\sRightarrow~& ~H = \sfrac{\somega}{2}( p^2 + x^2 )\n\send{eqnarray}\n\n!!! Properties of the harmonic oscillator ground state\n[>img[images/GroundStateInPhaseSpace.png]]\n\nThe wavefunction for the harmonic oscillator ground state is\n$$\sbraket{x}{0} \sequiv \spsi_0(x) = (\spi\shbar)^{-1/4} e^{-x^2/(2\shbar)}$$\nThe state is localized near the origin of phase space,\n\sbegin{eqnarray}\n\sexpect{x}~ &=& \sbraopket{0}{\shat{x}}{0} = 0 \s\s\n\sexpect{p}~ &=& \sbraopket{0}{\shat{p}}{0} = 0 \s\s\n\sDelta x^2 = \sexpect{x^2}~ &=& \sbraopket{0}{\shat{x}^2}{0} = \sfrac{\shbar}{2} \s\s\n\sDelta p^2 = \sexpect{p^2}~ &=& \sbraopket{0}{\shat{p}^2}{0} = \sfrac{\shbar}{2}.\n\send{eqnarray}\nIt is invariant under rotations in phase space; for any angle $\stheta$,\n\sbegin{eqnarray}\n\sexpect{\shat{x}\scos\stheta + \shat{p}\ssin\stheta}~ &=& 0 \s\s\n\sexpect{\sleft(\shat{x}\scos\stheta + \shat{p}\ssin\stheta\sright)^2}~ &=& \sfrac{\shbar}{2}\n\send{eqnarray}\nand the momentum-space wavefunction for $\sket{0}$ is\n$$\stilde{\spsi}_0(p) \sequiv \sbraket{p}{0} = (\spi\shbar)^{-1/4} e^{-p^2/(2\shbar)}$$\n\nThe variances $\sDelta x^2$ and $\sDelta p^2$ for the $\sket{0}$ state saturate the Heisenberg uncertainty relation\n$$\sDelta x^2\sDelta p^2 ~\sgeq 
\sfrac14\sleft|\sexpect{[\shat{x},\shat{p}]}\sright|^2 = \sleft(\sfrac{\shbar}{2}\sright)^2$$\nSo the ground state of the harmonic oscillator is as close as possible to having well-defined position and momentum at the same time. We call it a "minimum-uncertainty state". If the $\sket{0}$ state is [[translated in phase space|phase space translation operators]], by $x$ and $p$, this yields a [[coherent state]] $\sket{\salpha}$ (where $\salpha = x+ip$).\n\nFor free field theories -- in particular, the quantized theory of the electromagnetic field -- each normal mode of the field behaves like a harmonic oscillator. For the EM field, the coordinate is $\svec{A}$ and the momentum is $\svec{E}$. The ground state in this case is called the //vacuum state//, because it corresponds to the minimum energy (and therefore the minimum number of excitations in the field -- e.g. photons).
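The ground-state variances quoted above are easy to confirm numerically. The sketch below is an illustration only: it assumes NumPy, works in units where $\shbar = 1$, uses an arbitrarily chosen grid, and evaluates the momentum variance through $\sexpect{p^2} = \shbar^2\sint{|\spsi_0'(x)|^2\sdiff x}$ (valid because $\spsi_0$ is real and vanishes at infinity).

```python
import numpy as np

hbar = 1.0                               # assumption: units with hbar = 1
x = np.linspace(-10.0, 10.0, 20001)      # grid wide enough that psi ~ 0 at the ends
dx = x[1] - x[0]

# Ground-state wavefunction in the rescaled coordinate x.
psi = (np.pi * hbar) ** -0.25 * np.exp(-x**2 / (2 * hbar))

norm = np.sum(psi**2) * dx               # should be 1
var_x = np.sum(x**2 * psi**2) * dx       # <x^2>, since <x> = 0

# <p^2> = hbar^2 * integral of |dpsi/dx|^2 (integration by parts).
dpsi = np.gradient(psi, dx)
var_p = hbar**2 * np.sum(dpsi**2) * dx

print(norm, var_x, var_p, var_x * var_p)
```

The product `var_x * var_p` should come out at $(\shbar/2)^2$ up to discretization error, which is the saturation of the uncertainty relation described in the text.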
A Hermitian matrix $H$ is one that satisfies $H^\sdagger = H$. Quantum observables are represented by Hermitian operators. Hermitian operators are always [[normal]], so they have complete orthogonal sets of eigenvectors. Furthermore, they have //real// eigenvalues, so they are sometimes called "real operators", e.g. by Dirac.\n\nFor finite-dimensional matrices, being Hermitian is equivalent to being [[self-adjoint]], and the equivalence extends to [[bounded]] operators on arbitrary (e.g., infinite-dimensional) Hilbert spaces $\smathcal{H}$. Unbounded operators, however, are a bit more complicated (see [[self-adjoint]] for more details). The key points are:\n* The infinite-dimensional generalization of $H^\sdagger = H$ is \n$$\sbraket{H\sphi}{\spsi} = \sbraket{\sphi}{H\spsi}$$\n* An operator satisfying this condition //for all $\sket{\spsi}$, $\sket\sphi$ in its domain// is called ''symmetric''. However, its domain need not be all of $\smathcal{H}$, and it may not be bounded.\n* A symmetric operator is called Hermitian if and only if it is also bounded.\n* Finally, if a symmetric operator is defined everywhere (its domain is all of $\smathcal{H}$), then it is automatically [[self-adjoint]], and the Hellinger-Toeplitz Theorem states that every such everywhere-defined symmetric operator is also bounded (and therefore Hermitian).
/%\n|Name|HideTags|\n|Source|http://www.TiddlyTools.com/#HideTiddlerTags|\n|Version|0.0.0|\n|Author|Eric Shulman - ELS Design Studios (edited by Garrett)|\n|License|http://www.TiddlyTools.com/#LegalStatements <<br>>and [[Creative Commons Attribution-ShareAlike 2.5 License|http://creativecommons.org/licenses/by-sa/2.5/]]|\n|~CoreVersion|2.1|\n|Type|script|\n|Requires|InlineJavascriptPlugin|\n|Description|hide a note's tagged/tagging/references display elements|\n\nUsage: <<note HideTags>>\n\n%/<script>\n var t=story.findContainingNote(place);\n if (t && t.id!="noteHideTags")\n for (var i=0; i<t.childNodes.length; i++)\n {if (hasClass(t.childNodes[i],"tagging")||hasClass(t.childNodes[i],"tagged"))\n t.childNodes[i].style.display="none";\n if (hasClass(t.childNodes[i],"references"))\n t.childNodes[i].style.display="none";\n if (hasClass(t.childNodes[i],"viewer"))\n {t.childNodes[i].style.height="641px";}}\n</script>
A Hilbert space $\smathcal{H}$ is a vector space with two additional features:\n* It has an inner product. For any $\spsi,\sphi\sin\smathcal{H}$, $\sbraket{\spsi}{\sphi}$ is defined.\n* It is (as a [[metric space]]) //complete//, which means that every [[Cauchy sequence]] $[\spsi_1,\spsi_2,\sldots]$ in $\smathcal{H}$ converges to an element $\spsi$ of $\smathcal{H}$. The metric used here is $d(\spsi,\sphi) = ||\spsi-\sphi|| = \ssqrt{\sbraket{\spsi-\sphi}{\spsi-\sphi}} = \ssqrt{\sbraket{\spsi}{\spsi}+\sbraket{\sphi}{\sphi}-2\sRe{\sbraket{\spsi}{\sphi}}}$.\n\nExamples of non-complete sets include the rational numbers (which are [[dense]] on the real line, but nonetheless have holes within them), and open balls (which do not include their boundaries).\n\nAll finite-dimensional vector spaces over the real and complex numbers are Hilbert spaces (if we take the natural inner product on them). Hilbert spaces were defined specifically to include those infinite-dimensional spaces that behave somewhat similarly to finite-dimensional vector spaces (especially the Euclidean space $\sreals^n$).\n\nAn example of an infinite-dimensional Hilbert space is the set of all square-integrable complex functions on the real line, $L^2(\sreals)$ (see [[L2(R)]]). Its associated inner product is\n$$\sbraket{f}{g} \sequiv \sint_{-\sinfty}^{\sinfty}{f^*(x)g(x)\sdiff x},$$\nwhich is easily shown to (a) be an inner product, and (b) be defined for all square-integrable functions thanks to the [[Schwarz Inequality]] (whose proof relies only on $\sbraket{\scdot}{\scdot}$ being an inner product).\n\nAn example of a vector space that is //not// a Hilbert space is the set of all complex functions $f(x)$ (i.e., $f:\sreals\srightarrow\scomplex$). It has no natural inner product. 
In fact, this is a pretty ill-behaved vector space, since it doesn't even have a norm.\n\nAn example of a vector space that //does// have a norm, but not an inner product, is the set of all integrable functions $f(x)$ for which \n$$\sint_{-\sinfty}^{\sinfty}{|f(x)|\sdiff x}$$\nis finite. This space, called $L^1(\sreals)$, is a //Banach space// (a complete normed vector space) because $\sint_{-\sinfty}^{\sinfty}{|f(x)|\sdiff x}~$ is a valid norm (i.e., measure of the vector's length).\n\nAn example of a vector space with an inner product, which is nonetheless //not// a Hilbert space, is the set of all //smooth// $L^2$ functions. Also referred to as $C^\sinfty$ functions, or infinitely differentiable functions, the smooth functions have the property that all their derivatives $\spd{^nf~}{x^n~}$ are well-defined. For instance, $e^{-x^2}$ is smooth. The smooth functions are not a Hilbert space because they are not //complete//; there exists a [[Cauchy sequence]] of smooth square-integrable functions, for instance, that converges to the nonsmooth function\n$$f(x) = e^{-|x|}.$$\n\nThis demonstrates an important fact: not every subspace of a Hilbert space is itself a Hilbert space. The smooth functions are a subspace because they are closed under //finite// linear combination, but not under infinite linear combination -- i.e., they are not complete.
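To make the incompleteness of the smooth functions concrete, here is a small numerical sketch. The particular sequence $f_n(x) = e^{-\ssqrt{x^2+1/n^2}}$ is an assumption chosen for illustration (it does not come from the text above); each $f_n$ is smooth and square-integrable, yet the $L^2$ distance to the nonsmooth limit $e^{-|x|}$ shrinks as $n$ grows. NumPy and the grid parameters are likewise assumptions.

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 400001)   # grid; the tails beyond |x| = 20 are negligible
dx = x[1] - x[0]

f_limit = np.exp(-np.abs(x))           # nonsmooth limit: kink at x = 0


def f_n(n):
    """Smooth approximation: sqrt(x^2 + 1/n^2) rounds off the kink of |x|."""
    return np.exp(-np.sqrt(x**2 + 1.0 / n**2))


def l2_distance(f, g):
    """Riemann-sum estimate of the L^2 norm of f - g."""
    return np.sqrt(np.sum(np.abs(f - g) ** 2) * dx)


distances = [l2_distance(f_n(n), f_limit) for n in (1, 10, 100)]
print(distances)
```

The printed distances decrease toward zero, so the $f_n$ form a Cauchy sequence inside the space of smooth functions whose limit lies outside it.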
//Context:// [[Lecture 10]]\n\nHilbert-Schmidt space is the space of [[bounded]] operators on a Hilbert space. Given a Hilbert space $\smathcal{H}$, we denote the operator space by $\smathcal{B(H)}$. (//Note: all operators on finite dimensional Hilbert spaces $\smathcal{H} = \scomplex^d$ are bounded.//) By defining the Hilbert-Schmidt inner product on operators,\n$$\shat{A}\scdot\shat{B} \sequiv \sTr\sleft[\shat{A}^\sdagger\shat{B}\sright]$$\nwe ensure that $\smathcal{B(H)}$ is, itself, a Hilbert space. Hilbert-Schmidt space is particularly useful in quantum mechanics because most of the quantities that we calculate are Hilbert-Schmidt inner products. Examples include: probabilities calculated via [[Born's Rule]], [[expectation value]]s, [[Husimi functions]], [[Wigner functions]]...\n\sbegin{eqnarray}\np(j) &=& \sleft|\sbraket{\sphi_j}{\spsi}\sright|^2 = \sTr\sleft[ \sproj{\sphi_j}\sproj{\spsi} \sright] \s\s\n&\smathrm{or}&\s\s\np(j) &=& \sTr\sleft[ \shat{E}_j \shat{\srho} \sright] \s\s\n\sexpect{\shat{A}} &=& \sTr\sleft[ \shat{A} \shat{\srho} \sright] \s\s\nQ(x,p) &=& \sfrac{1~}{2\spi\shbar~}\sTr\sleft[ \sproj{\salpha} \shat{\srho}\sright] \s\s\nW(x,p) &=& \sfrac{1~}{\spi\shbar~}\sTr\sleft[ \shat{A}_{x,p} \shat{\srho}\sright]\n\send{eqnarray}\n\n!!! Deriving Hilbert-Schmidt space\n\nMatrices form a vector space. This should be sort of obvious -- you can add them, and you can multiply them by scalars, and you get back another matrix. In fact, linear operators always form a vector space. For the quantum operators that we're dealing with, this vector space is called Hilbert-Schmidt space, and usually denoted $\smathcal{B(H)}$, short for "[[bounded]] operators on $\smathcal{H}$". For finite dimensional Hilbert spaces, boundedness is a total red herring -- every operator is bounded -- but in infinite-dimensional spaces like $L^2(\sreals)$, there are [[unbounded]] operators. 
($\smathcal{L(H)}$ is also used, short for "linear operators on $\smathcal{H}$", to denote the vector space of //all// operators on $\smathcal{H}$.)\n\nOne easy way to "vectorize" a matrix is to take a matrix like\n$$\smat{ a & b & c \s\s d & e & f \s\s g & h & k }$$\nand just string its rows together, one after another, into a single column:\n$$\smat{ a \s\s b \s\s c \s\s d \s\s e \s\s f \s\s g \s\s h \s\s k }.$$\nBang, we've got a vector. Now, if we were to take the inner product of two matrices\n$$A = \smat{ a & b \s\s c & d}; B = \smat{ e & f \s\s g & h},$$\naccording to this vectorization, we would get\n$$\svec{A}^\sdagger\scdot\svec{B} = \smat{a^* & b^* & c^* & d^*}\smat{e \s\s f \s\s g \s\s h } = a^*e + b^*f + c^*g + d^*h.$$\nNow, here is something //really// cool:\n$$\svec{A}^\sdagger\scdot\svec{B} = \sTr[A^\sdagger B],$$\nwhere in the trace, the matrices have not been vectorized. Go ahead, prove to yourself that it works!\n\n//Exercise:// Prove that if the elements of $A$ are $A_{i,j}$ and those of $B$ are $B_{i,j}$, then $\sTr[A^\sdagger B] = \ssum_{i,j}{A_{i,j}^*B_{i,j}}$.\n\nThis is called the //Hilbert-Schmidt inner product// between $A$ and $B$. Using this inner product, we can find a //basis// of matrices for $\smathcal{B(H)}$ -- if the Hilbert space $\smathcal{H}$ is $d$-dimensional, then the Hilbert-Schmidt space is $d^2$-dimensional, and therefore we need $d^2$ basis matrices.\n\n//Exercise:// Show that the vectorization given above is equivalent to using the basis $\s{B_{i,j}:\s i,j=1\sldots d\s}$, where $B_{i,j}$ is a matrix with zeros everywhere except for a 1 in row $i$, column $j$.\n\nNow, Hilbert-Schmidt space is a complex vector space -- given any matrix, you can multiply it by a complex number and get another matrix. The //Hermitian// matrices, however, form a //real// vector space. A Hermitian matrix multiplied by a non-real number is no longer Hermitian. 
And, indeed, the Hilbert-Schmidt inner product between any two Hermitian matrices is a real number, so if we pick a basis of Hermitian matrices $\s{H_k:\s k=1\sldots d^2\s}$, the coefficients of any other Hermitian matrix $J$ in that basis will be real.\n\n!!! Hilbert-Schmidt space and the Bloch sphere\n\nLet's apply this theory to the $2\stimes 2$ operators on a 2-dimensional Hilbert space. As a basis, we will choose the [[Pauli matrices]], $\s{\sId, \ssigma_x, \ssigma_y, \ssigma_z\s}$.\n\n//Exercise:// Show that $\s{\sId, \ssigma_x, \ssigma_y, \ssigma_z\s}$ form an orthogonal (though not quite normalized) basis according to the Hilbert-Schmidt inner product.\n\nSince the Pauli operators form a basis, we can write any operator $M$ as\n$$M = c_1\sId + c_x\ssigma_x + c_y\ssigma_y + c_z\ssigma_z,$$\nand the coefficients are given by\n$$\svec{M} = \smat{c_1\s\sc_x\s\sc_y\s\sc_z} = \sfrac12\smat{\sTr(M)\s\s \sTr(\ssigma_x M) \s\s \sTr(\ssigma_y M) \s\s \sTr(\ssigma_z M) }$$\nNow, suppose that $M$ is a density matrix $\srho$. Then (recalling that $\sTr\srho=1$),\n$$\svec{M} = \sfrac12\smat{1\s\s \sexpect{\ssigma_x} \s\s \sexpect{\ssigma_y} \s\s \sexpect{\ssigma_z}},$$\nwhich is just the Bloch vector, with an extra "1" stuck on the top of it! In other words, the [[Bloch sphere]] is really a construct in Hilbert-Schmidt space -- or, more precisely, the //real//-valued vector space of Hermitian operators. Actually, because all states have trace 1, there's the "1" in the identity component of the Hilbert-Schmidt vector for any $\srho$, which makes the $\sreals^3$ space in which the Bloch sphere sits an //affine hyperplane// rather than a vector space, but this is not a terribly critical point.\n\nWe can see, via this argument, that interior points of the Bloch sphere are density matrices. Recall that a mixed state's density matrix can be written as\n$$\srho = p_1\sproj{\spsi_1} + p_2\sproj{\spsi_2}$$\nwhere $p_1 + p_2 = 1$. 
Since these are all matrices, $\srho$ is a convex linear combination of pure state projectors, and it is exactly the pure state projectors whose Bloch or Hilbert-Schmidt vectors correspond to the surface of the Bloch sphere! Thus, every convex combination of them is a point inside the sphere -- and every point inside the sphere can be obtained as a convex combination of the pure states on the surface (i.e., as a density matrix).\n\nFinally, a couple of useful points about distances in Hilbert-Schmidt space.\n* The //Hilbert-Schmidt metric// is a natural distance measure on states, defined as \n$$D_{HS} = \ssqrt{\sfrac12\sTr[ (\srho-\ssigma)^2\s,]\s,}$$\n* This is precisely the Euclidean metric in Hilbert-Schmidt space, so it's also the natural Euclidean distance between two points on the Bloch sphere.\n* If $\srho$ and $\ssigma$ are pure states, then\n$$D_{HS} = \ssqrt{\sfrac12\sTr[ (\sproj{\spsi}-\sproj{\sphi})^2]} = \ssqrt{ 1-|\sbraket{\spsi}{\sphi}|^2 }$$\nThis is directly related to the transition probability ([[Born's Rule]]) between the two states -- it's 1 for orthogonal states, and 0 for identical states. For general mixed states, however, it doesn't have this meaning (e.g., perfectly distinguishable mixed states can have arbitrarily small HS distance when the dimension of the Hilbert space is large).\n\n
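The identities in this note can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names `hs_inner`, `pauli_vector`, and `hs_distance` are hypothetical helpers introduced for illustration. It compares $\sTr[A^\sdagger B]$ with the dot product of the vectorized matrices, extracts the Pauli-basis coefficients of a density matrix, and evaluates the Hilbert-Schmidt distance between two pure states.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, sx, sy, sz]


def hs_inner(A, B):
    """Hilbert-Schmidt inner product Tr[A^dagger B]."""
    return np.trace(A.conj().T @ B)


def pauli_vector(M):
    """Coefficients (c_1, c_x, c_y, c_z) of a 2x2 matrix in the Pauli basis."""
    return np.array([hs_inner(P, M) for P in paulis]) / 2


def hs_distance(rho, sigma):
    """D_HS = sqrt( Tr[(rho - sigma)^2] / 2 ) for Hermitian rho, sigma."""
    d = rho - sigma
    return np.sqrt(np.trace(d @ d).real / 2)


# Random complex 3x3 matrices: trace formula versus vectorized dot product.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
# ravel() strings the rows together; np.vdot conjugates its first argument.
agree = np.isclose(hs_inner(A, B), np.vdot(A.ravel(), B.ravel()))

# Two pure qubit states: the sigma_x eigenstate |+> and the state |up>.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
up = np.array([1, 0], dtype=complex)
rho_plus = np.outer(plus, plus.conj())
rho_up = np.outer(up, up.conj())

coeffs = pauli_vector(rho_plus)            # half of (1, <sx>, <sy>, <sz>)
dist = hs_distance(rho_plus, rho_up)       # compare with sqrt(1 - |<+|up>|^2)
print(agree, coeffs.real, dist)
```

For these two states the overlap is $|\sbraket{+}{\suparrow}|^2 = \sfrac12$, so `dist` can be compared directly against the pure-state formula for $D_{HS}$.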
The answers to these problems are due, via email to ta@am473.ca, by Wednesday October 1 at 11:30 AM. Please follow the [[Homework Policy]] for this class.\n\n1. The Nature of States (17 pts)\n* (a) Look up (in a dictionary, encyclopedia, Teh Interwebs, etc.) the words "epistemic" and "ontological" (see also "ontic"). Based on this research, write 1-5 sentence definitions of the two words, as used in the context "This state of a physical system is an {ontological/epistemic} state.". ''Use your own words.'' You can certainly paraphrase other sources, but if a Google search for the text of your answer turns up a [nearly] exact match, I'll be forced to assume that you plagiarized. (5 pts)\n* (b) In each of the following sentences, state (& explain briefly) whether the word "state" is being used in the sense of "ontological state" or "epistemic state". (12 pts)\n** (i) "The state of this coin is `heads facing up'."\n** (ii) "The state of this classical particle moving along a line is: $\sleft\s{x=2.0\spm0.2\smathrm{\s cm;\s }p=0\spm0.1\smathrm{\s }\sfrac{\smathrm{g}\scdot\smathrm{cm}}{\smathrm{s}}\sright\s}~$."\n** (iii) "The state of the $10^{23}$ nitrogen atoms in this 1 liter box is: they are in thermal equilibrium at exactly 1 atmosphere of pressure and exactly 273 degrees Kelvin."\n** (iv) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ 1 \s\s 0 }~$ in the basis $\s{\sket{\suparrow},\sket{\sdownarrow}\s}$ defined by measuring $J_z$ with a Stern-Gerlach."\n** (v) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ \sfrac{1}{\ssqrt2}~ \s\s \sfrac{1}{\ssqrt2}~ }$ in the same basis as (iv)."\n** (vi) "The state of this silver atom's angular momentum is $\srho = \sfrac12\sleft(\sproj{\suparrow} + \sproj{\sdownarrow}\sright)~$."\n\n2. Probabilistic descriptions of physical systems: (25 pts)\nSuppose I invent a 3-sided die, or "Trie". 
Its sides are labeled "1", "2", and "3", and we will be interested //only// in the side that is facing up when it is thrown. I manufacture four different versions -- called Alpha, Beta, Gamma, and Delta -- each with a different bias. Extensive testing shows that, when each model is thrown many times in the same way:\n** When Alpha is thrown, each side shows up equally often.\n** When Beta is thrown, "1" appears three times as often as "3", and "2" appears twice as often as "3".\n** When Gamma is thrown, "1" and "2" appear equally often and "3" never shows up.\n** When Delta is thrown, "3" appears every time.\n* (a) For each of the four models, suppose that I throw it so that you cannot see how it lands. Write down (for each model) the probabilistic state describing your knowledge of how it lies. (4 pts)\n* (b) Draw and label the probability simplex for the Trie, and plot each of the four states from (a) on it. (3 pts)\n* (c) For each of the states in (a): //first//, state whether it is pure or mixed; //second//, either write down two different convex decompositions of the state (i.e., write it as a convex combination of 2 other states) or explain why this cannot be done. (4 pts)\n* (d) Let the Trie's sample space be indexed by a number $n\sin[1,2,3]$. Write down two different observables for the Trie, at least one of which is complete. (4 pts)\n* (e) For the complete observable you specified in (d), write down the corresponding observation as a set of indicator functions. (5 pts)\n* (f) Suppose that your procedure for performing the observation in (e) is flawed (perhaps you need an eye exam?). Half the time, it works, but half the time you get a uniformly random result (that bears no connection with the actual configuration of the Trie!), and you don't know when it fails. Write down this observation as a set of indicator functions. (5 pts)\n\n3. 
Bayes' Rule, decisions, and convex combination (28 pts)\n* (a) For each of the models of Trie described in Problem 2, suppose that the corresponding Trie is thrown, and then you make the observation described in 2(f). What is -- in each of the four cases -- the probability distribution of the outcomes? (4 pts)\n* (b) In each of the cases of (a), suppose you get the result corresponding to "1". Write down (for each model) the probabilistic state describing your knowledge of how the Trie lies //after// the observation. (4 pts)\n* (c) For each of the cases in (a), you are asked to make your best guess as to how the Trie lies. What is the probability that you guess correctly, both: (i) //before// the observation, and (ii) //after// the observation? For which models did the observation help you? (//Note: ignore part (b) here; do __not__ assume that you get result "1"//). (4 pts)\n* (d) I fill an urn with 100 Tries -- 50 "Beta" models and 50 "Gamma" models -- then pull one out at random and throw it so that you cannot see either which //kind// of Trie it is, or how it falls. Write down a probabilistic state describing your knowledge of how it lies. (3 pts)\n* (e) You are asked to make an observation on the Trie thrown in (d), and then to decide whether I selected a Beta or a Gamma model. Write down an observation (as a set of indicator functions) //with only two outcomes// that achieves the highest achievable probability of guessing correctly, and state the probability of guessing correctly both before and after the observation. (6 pts)\n* (f) Generalize part (e) to an arbitrary $n$-state system, where I may have chosen to prepare either state $\svec{p} = \s{p_1\sldots p_n\s}$ or $\svec{q} = \s{q_1\sldots q_n\s}$, with equal //prior probabilities// for $\svec{p}$ and $\svec{q}$. 
You can perform any observation on a single sample of the unknown state; show that the highest probability of guessing correctly is given by $P_{correct} = \sfrac12 + \sfrac14\ssum_{k=1}^{n}{|p_k-q_k|}~$. (7 pts)\n\n4. Diagonalizing observables and unitary matrices. (30 pts)\nSuppose $A$ is a Hermitian matrix (i.e., a representation of an operator $\smathbf{A}$ on a finite-dimensional Hilbert space). Let us call the basis in which $A$ is written $\s{\sket{1},\sket{2},\sldots\sket{n}\s}$, so $A_{ij} \sequiv \sbraopket{i}{\smathbf{A}}{j}$.\n* (a) The Spectral Theorem implies that $A$ has a complete set of eigenkets $\s{\sket{a_i}\s}$ with eigenvalues $\s{a_i\s}$. Write down a matrix $U$ such that $UAU^\sdagger$ is diagonal. (5 pts)\n* (b) Prove that $U$ is //unitary//. (3 pts)\n* (c) Write $UAU^\sdagger$ in Dirac notation. Your answer should make it clear what the diagonal elements are. (5 pts) \n* (d) The //commutator// of two matrices $A$ and $B$ is defined as $[A,B] = AB-BA$. Prove that $U$ commutes with its adjoint -- i.e., $[U,U^\sdagger]=0$. (3 pts)\n* (e) A matrix that commutes with its adjoint is called //normal//, and the Spectral Theorem can be proved for all normal matrices. Use this to prove that the eigenvalues of $U$ are $\s{e^{i\stheta_k}\s}$, where the $\stheta_k$ are real numbers. (6 pts)\n* (f) The //infinity norm// of a normal operator $X$, denoted $||X||_\sinfty$, is the largest absolute value among $X$'s eigenvalues. It is a good measure of the overall magnitude of an operator (e.g., if $||X||_\sinfty=0$, then $X=0$). Prove the following theorem: For any real number $\sepsilon>0$, there is an integer $n>0$ such that $||U^n-\sId||_\sinfty < \sepsilon$. (8 pts)\n
<<<\n1. The Nature of States\n* (a) Look up (in a dictionary, encyclopedia, Teh Interwebs, etc.) the words "epistemic" and "ontological" (see also "ontic"). Based on this research, write 1-5 sentence definitions of the two words, as used in the context "This state of a physical system is an {ontological/epistemic} state.". ''Use your own words.'' You can certainly paraphrase other sources, but if a Google search for the text of your answer turns up a [nearly] exact match, I'll be forced to assume that you plagiarized.\n<<<\n\nEpistemic: Relating to knowledge. An //epistemic state// is one that describes knowledge about a physical system, generally someone's knowledge (which may well be incomplete) about that system's properties and how it will behave.\n\nOntic or Ontological: Relating to reality. An //ontological state// is one describing how a system really is -- what its properties are, independent of whether anybody knows them or not.\n\n<<<\n* (b) In each of the following sentences, state (& explain briefly) whether the word "state" is being used in the sense of "ontological state" or "epistemic state".\n** (i) "The state of this coin is `heads facing up'."\n<<<\n\nThis is both an ontological state (it says what the coin's property //is//) and an epistemic state (it could describe my knowledge, if I knew everything about the coin). Full credit for "ontological" or "both".\n\n<<<\n** (ii) "The state of this classical particle moving along a line is: $\sleft\s{x=2.0\spm0.2\smathrm{\s cm;\s }p=0\spm0.1\smathrm{\s }\sfrac{\smathrm{g}\scdot\smathrm{cm}}{\smathrm{s}}\sright\s}\s,$."\n<<<\n\nThis is an epistemic state. 
Classical particles have well-defined positions and momenta, but this state describes uncertainty and therefore must be someone's knowledge of the particle.\n\n<<<\n** (iii) "The state of the $10^{23}$ nitrogen atoms in this 1 liter box is: they are in thermal equilibrium at exactly 1 atmosphere of pressure and exactly 273 degrees Kelvin."\n<<<\n\nThis is an epistemic state because it describes vast uncertainty about the atoms' individual degrees of freedom. Had I said "..of the gas in this box..." it would have been ambiguous, because pressure, volume, and temperature might be taken as a complete specification of the rather vague system "gas" -- but they do not specify the true, detailed microstate of the $10^{23}$ atoms. Thermodynamic states are //always// epistemic -- this is why it's called "statistical mechanics".\n\n<<<\n** (iv) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ 1 \s\s 0 }~$ in the basis $\s{\sket{\suparrow},\sket{\sdownarrow}\s}$ defined by measuring $J_z$ with a Stern-Gerlach."\n<<<\n\nAmbiguous. There's no consensus on whether quantum pure states are epistemic or ontological! Full credit for anything but a blank answer (the point was to make you think about it...).\n\n<<<\n** (v) "The state of this silver atom's angular momentum is $\sket\spsi = \smat{ \sfrac{1}{\ssqrt2}\s, \s\s \sfrac{1}{\ssqrt2}\s, }$ in the same basis as (iv)."\n<<<\n\nAmbiguous for the same reason as above. However, your answer here must agree with the answer to (iv) to get credit -- these are both pure states and there is no difference in their status.\n\n<<<\n** (vi) "The state of this silver atom's angular momentum is $\srho = \sfrac12\sleft(\sproj{\suparrow} + \sproj{\sdownarrow}\sright)~$."\n<<<\n\nThis is an epistemic state. Quantum mixed states are, like classical probabilistic states, convex mixtures of pure states, and represent uncertainty as to which pure state describes the system. 
However, if you //explicitly// argued that the state could be ontological //if// the silver atom was known to be entangled with another system (something we haven't covered yet!), then you get full credit.\n\n<<<\n2. Probabilistic descriptions of physical systems:\nSuppose I invent a 3-sided die, or "Trie". Its sides are labeled "1", "2", and "3", and we will be interested //only// in the side that is facing up when it is thrown. I manufacture four different versions -- called Alpha, Beta, Gamma, and Delta -- each with a different bias. Extensive testing shows that, when each model is thrown many times in the same way:\n** When Alpha is thrown, each side shows up equally often.\n** When Beta is thrown, "1" appears three times as often as "3", and "2" appears twice as often as "3".\n** When Gamma is thrown, "1" and "2" appear equally often and "3" never shows up.\n** When Delta is thrown, "3" appears every time.\n* (a) For each of the four models, suppose that I throw it so that you cannot see how it lands. 
Write down (for each model) the probabilistic state describing your knowledge of how it lies.\n<<<\n\nI will write each state as a probability vector; other representations are permissible.\n$$ \svec{P}_\salpha = \smat{\sfrac13 \s\s \sfrac13 \s\s \sfrac13};\s \s \svec{P}_\sbeta = \smat{\sfrac12 \s\s \sfrac13 \s\s \sfrac16};\s \s \svec{P}_\sgamma = \smat{\sfrac12 \s\s \sfrac12 \s\s 0};\s \s \svec{P}_\sdelta = \smat{0 \s\s 0 \s\s 1} $$\n\n<<<\n* (b) Draw and label the probability simplex for the Trie, and plot each of the four states from (a) on it.\n<<<\n\n[img[images/HW2/HW2Sols-Simplex.png]]\n\n<<<\n* (c) For each of the states in (a): //first//, state whether it is pure or mixed; //second//, either write down two different convex decompositions of the state (i.e., write it as a convex combination of 2 other states) or explain why this cannot be done.\n<<<\n$\svec{P}_\salpha$, $\svec{P}_\sbeta$, and $\svec{P}_\sgamma$ are all mixed; $\svec{P}_\sdelta$ is pure.\nThere are many ways to write each of the mixed states as a convex combination of other states. A convex decomposition with two terms must be of the form $\svec{P} = p\svec{Q}_1 + (1-p)\svec{Q}_2$, where $\svec{Q}_1$ and $\svec{Q}_2$ are valid probabilistic states. Here are some examples:\n* $\svec{P}_\salpha = \sfrac23\svec{P}_\sgamma + \sfrac13\svec{P}_\sdelta = \sfrac23\svec{P}_\sbeta + \sfrac13\smat{0\s\s \sfrac13 \s\s \sfrac23}$\n* $\svec{P}_\sbeta = \sfrac12\svec{P}_\salpha + \sfrac12\smat{\sfrac23 \s\s \sfrac13 \s\s 0} = \sfrac23\svec{P}_\sgamma + \sfrac13\smat{\sfrac12\s\s0\s\s \sfrac12}$\n* $\svec{P}_\sgamma = \sfrac12\smat{1\s\s 0\s\s 0}+\sfrac12\smat{0\s\s 1\s\s 0} = \sfrac14\smat{1 \s\s 0\s\s 0}+\sfrac34\smat{\sfrac13\s\s \sfrac23 \s\s 0}$\nHowever, $\svec{P}_\sdelta$ is a pure state, which means it cannot be written as a convex combination of any states //other// than itself. 
Since $\svec{P}_\sdelta$ itself is the only probabilistic state which assigns zero probability to "1" and "2", any convex combination involving any other state would have to assign nonzero probability to "1" or "2", and could not therefore yield $\svec{P}_\sdelta$. //Note: it is sufficient for full credit to observe that $\svec{P}_\sdelta$ is pure; the fuller explanation is not necessary.//\n\n<<<\n* (d) Let the Trie's sample space be indexed by a number $n\sin[1,2,3]$. Write down two different observables for the Trie, at least one of which is complete.\n<<<\n\nAgain, there are very many possibilities. Complete observables include $n$, $n^2$, $e^n$, etc. Incomplete observables include $1$, $n\smathrm{\s mod\s }2$, and $(-1)^n$.\n\n<<<\n* (e) For the complete observable you specified in (d), write down the corresponding observation as a set of indicator functions.\n<<<\n\nAny complete observable assigns different values to each value of $n$, and therefore the corresponding observation is the one that resolves each state perfectly. The indicator functions could be written using Kronecker delta notation as $\s{\sdelta_{n,1}, \sdelta_{n,2}, \sdelta_{n,3}\s}$ or as vectors in the dual space to probability space (//Note: -1 point if you write them as probability vectors; this is mostly right but misses the deep point that indicator functions live in the dual space//),\n$$\sleft\s{\smat{1 & 0 & 0}, \smat{0 & 1 & 0}, \smat{0 & 0 & 1}\sright\s},$$\nor in any other way of representing the correct set of $I_j(n)$ indicator functions.\n\n<<<\n* (f) Suppose that your procedure for performing the observation in (e) is flawed (perhaps you need an eye exam?). Half the time, it works, but half the time you get a uniformly random result (that bears no connection with the actual configuration of the Trie!), and you don't know when it fails. 
Write down this observation as a set of indicator functions.\n<<<\n\nI will write the indicator functions as vectors in the probability dual space:\n$$\sleft\s{\smat{\sfrac23 & \sfrac16 & \sfrac16}, \smat{\sfrac16 & \sfrac23 & \sfrac16}, \smat{\sfrac16 & \sfrac16 & \sfrac23}\sright\s}.$$\nThat these are the correct indicator functions follows from the observation that //if// the true state is $n$, then outcome "$n$" occurs with probability $\sfrac23$, and each of the others occurs with probability $\sfrac16$. Thus, the measurement "works" $\sfrac12$ of the time, but we get the right answer by random accident $\sfrac16 = \sfrac13\scdot\sfrac12$ of the time too. //Note: $\sfrac12$ credit if you wrote $\smat{\sfrac12 & \sfrac14 & \sfrac14}$ and permutations, thus getting the right idea but the wrong math.//\n\n<<<\n3. Bayes' Rule, decisions, and convex combination\n* (a) For each of the models of Trie described in Problem 2, suppose that the corresponding Trie is thrown, and then you make the observation described in 2(f). What is -- in each of the four cases -- the probability distribution of the outcomes?\n<<<\nI will label the outcome by "j", and write the probability distribution of the outcomes in each case as a probability vector:\n* Alpha: $P_{obs}(j) = \svec{I}_j\scdot\svec{P}_\salpha$, so $\svec{P}_{obs} = \smat{\sfrac13 \s\s \sfrac13 \s\s \sfrac13}$.\n* Beta: $\svec{P}_{obs} = \smat{\sfrac{5}{12} \s\s \sfrac13 \s\s \sfrac14}$.\n* Gamma: $\svec{P}_{obs} = \smat{\sfrac{5}{12} \s\s \sfrac{5}{12} \s\s \sfrac16}$.\n* Delta: $\svec{P}_{obs} = \smat{\sfrac16 \s\s \sfrac16 \s\s \sfrac23}$.\n\n<<<\n* (b) In each of the cases of (a), suppose you get the result corresponding to "1". 
Write down (for each model) the probabilistic state describing your knowledge of how the Trie lies //after// the observation.\n<<<\n\nIn each case we apply Bayes' Rule, which says that after observing outcome $k$ described by indicator function $I_k(n)$, we update to $P'(n) \spropto P_0(n)I_k(n)$, and normalize.\n* Alpha: $\svec{P}' = \smat{\sfrac23 \s\s \sfrac16 \s\s \sfrac16}$\n* Beta: $\svec{P}' = \smat{\sfrac45 \s\s \sfrac{2}{15} \s\s \sfrac{1}{15}}$\n* Gamma: $\svec{P}' = \smat{\sfrac45 \s\s \sfrac15 \s\s 0}$\n* Delta: $\svec{P}' = \smat{0 \s\s 0 \s\s 1}$\n\n<<<\n* (c) For each of the cases in (a), you are asked to make your best guess (maximizing probability of guessing correctly) as to how the Trie lies. What is the probability that you guess correctly, both: (i) //before// the observation, and (ii) //after// the observation? For which models did the observation help you? (//Note: ignore part (b) here; do __not__ assume that you get result "1"//).\n<<<\n\n(i) Before the observation, we have nothing but our //prior// knowledge of the probabilities of the various values of $n$ to go on. The best guess is the most probable value, and the probability that we guess correctly is simply its probability. Thus:\n* Alpha: $P_{correct} = \sfrac13$\n* Beta: $P_{correct} = \sfrac12$\n* Gamma: $P_{correct} = \sfrac12$\n* Delta: $P_{correct} = 1$\n\n(ii) We already determined the probabilities $P_{obs}(j)$ of the various outcomes $j$ in part (a). For each of the three possible outcomes $j$, we then apply Bayes' Rule to update our probability distribution to $P(n|j)$. The best guess is the //mode// of the resulting state -- i.e., the $n'$ such that $P(n'|j)>P(n|j)$ for all other $n$. The probability that this guess is correct is just $P(n'|j) = \smax_n{P(n|j)}$. 
Thus, the total probability of guessing correctly is the average of this quantity over all the possible outcomes $j$, or\n$$P_{correct} = \ssum_j{P_{obs}(j)\smax_n{P(n|j)}}$$.\n* Alpha: This one is easy; each outcome $j$ is equally probable, and no matter which one we get, the best guess is $n=j$, which is correct with probability $\sfrac23$. Thus $P_{correct} = \sfrac23$.\n* Beta: Working through Bayes' Rule, we find again that the best guess is $n=j$, and this guess is correct with probabilities $P(1|1)=\sfrac45$, $P(2|2)=\sfrac23$, and $P(3|3)=\sfrac49$. Thus $P_{correct}=\sfrac{5}{12}\sfrac45 + \sfrac13\sfrac23+\sfrac14\sfrac49 = \sfrac23$.\n* Gamma: In this case, if we observe "1" or "2" then the best guess is $n=j$ and this guess is correct with probability $P(1|1)=P(2|2)=\sfrac45$. If we observe "3" then we //know// the measurement has failed, and we fall back on guessing either "1" or "2", getting it right with probability $\sfrac12$. Thus $P_{correct} = 2\sfrac{5}{12}\sfrac45 + \sfrac16\sfrac12 = \sfrac34$.\n* Delta: The measurement is pointless; we already know $n=3$, so $P_{correct}=1$.\n\n* The observation helps in all cases except the pure state (Delta), where we already have complete knowledge.\n\n<<<\n* (d) I fill an urn with 100 Tries -- 50 "Beta" models and 50 "Gamma" models -- then pull one out at random and throw it so that you cannot see either which //kind// of Trie it is, or how it falls. Write down a probabilistic state describing your knowledge of how it lies.\n<<<\n\nI picked a Beta or a Gamma Trie with probability $\sfrac12$ each, so the state describing the unknown Trie is a convex combination:\n$$\svec{P} = \sfrac12\svec{P}_\sbeta + \sfrac12\svec{P}_\sgamma = \smat{\sfrac12 \s\s \sfrac{5}{12} \s\s \sfrac{1}{12}}$$\n<<<\n* (e) You are asked to make an observation on the Trie thrown in (d), and then to decide whether I selected a Beta or a Gamma model. 
Write down an observation (as a set of indicator functions) //with only two outcomes// that achieves the highest achievable probability of guessing correctly, and state the probability of guessing correctly both before and after the observation.\n<<<\n\nThere are two possibilities: either I picked a Beta or a Gamma. You can describe your initial knowledge of this by a probability distribution over the sample space $G = \s{\sbeta, \sgamma\s}$, and this distribution is $\svec{p_0} = \smat{\sfrac12 \s\s \sfrac12}$. The goal of an observation is to help you decide which is //more// probable, //given// the observation -- which is a job for Bayes' Rule.\n\nAn observation of $n$ tells you something about whether I picked Beta or Gamma, because each outcome corresponds to an indicator function on the sample space G:\n$$\svec{I}_n = \smat{ P_\sbeta(n) & P_\sgamma(n) }$$.\nThis is a little tricky, and deserves some careful thought -- the elements of the indicator function do //not// form a probability distribution, but they do represent the relative probabilities of observing $n$ if the thrown Trie was a Beta or Gamma model (respectively).\n\nThe full observation of $n$ thus corresponds to indicator functions on G:\n$$\sleft\s{ \smat{ \sfrac12 \s\s \sfrac12 }, \smat{ \sfrac13 \s\s \sfrac12 }, \smat{ \sfrac16 \s\s 0 } \sright\s}.$$\nSince your prior knowledge is unbiased -- i.e., $\svec{p_0} = \smat{\sfrac12 \s\s \sfrac12}$ -- your state $\svec{p}'$ //after// an observation of $n$ is determined completely by the corresponding indicator function, and is in fact proportional to it. Thus, for each value of $n$ observed, you should guess "Beta" if $P_\sbeta(n) > P_\sgamma(n)$, and "Gamma" if $P_\sgamma(n) > P_\sbeta(n)$... 
and if the two are equal, then you can guess either way.\n\nIn other words, you do not need to observe $n$; you merely need to observe the sign function $\smathrm{sgn}(P_\sbeta(n)-P_\sgamma(n))$, which is +1 if its argument is non-negative and -1 if it is negative (//Note: the sign function can also be defined to take values $\s{\spm1,0\s}$, but this definition is more convenient here.//) This is not a complete observable -- it has two indicator functions, one corresponding to all the values of $n$ where $P_\sbeta(n) \sgeq P_\sgamma(n)$, and the other corresponding to the values where $P_\sbeta(n) < P_\sgamma(n)$. In this particular case, this observation is\n$$ \s{ \svec{I}_\sbeta = \smat{1 & 0 & 1}, \svec{I}_\sgamma = \smat{0 & 1 & 0} \s},$$\nwhich means that you are going to guess "Beta" if you get $n=1$ or $n=3$, and "Gamma" if you get $n=2$. However, since $P_\sgamma(1) = P_\sbeta(1)$, we can guess either way if we get $n=1$, or even guess randomly, so any observation \n$$ \s{ \svec{I}_\sbeta = \smat{p & 0 & 1}, \svec{I}_\sgamma = \smat{1-p & 1 & 0} \s},$$\nis a valid solution!\n\nBefore the observation, the two models are equally likely, so any guess is correct with probability $\sfrac12$. After the observation, the probability of guessing correctly is\n$$ P_{correct} = P(\sbeta)P(I_\sbeta|\sbeta) + P(\sgamma)P(I_\sgamma|\sgamma) = \sfrac12\sleft(P_\sbeta(1)+P_\sbeta(3)\sright) + \sfrac12P_\sgamma(2) = \sfrac{7}{12}.$$\n\n<<<\n* (f) Generalize part (e) to an arbitrary $n$-state system, where I may have chosen to prepare either state $\svec{p} = \s{p_1\sldots p_n\s}$ or $\svec{q} = \s{q_1\sldots q_n\s}$, with equal //prior probabilities// for $\svec{p}$ and $\svec{q}$. You can perform any observation on a single sample of the unknown state; show that the highest probability of guessing correctly is given by $P_{correct} = \sfrac12 + \sfrac14\ssum_n{|p(n)-q(n)|}$.\n<<<\n\nMost of the reasoning is in the solution to (e). 
We could simply measure $n$, and then we would guess that $\svec{p}$ was prepared if we got a value of $n$ for which $p(n) > q(n)$, and that $\svec{q}$ was prepared if we got a value of $n$ for which $q(n) > p(n)$. If we get one for which $q(n)=p(n)$ then it doesn't matter which we guess; we have a 50% chance either way.\n\nThis can be represented, again, as a 2-outcome measurement of the observable $\smathrm{sgn}(p(n)-q(n))$. The probability of success is given by\n$$ P_{correct} = P(\svec{p})P(I_{\svec{p}}|\svec{p}) + P(\svec{q})P(I_{\svec{q}}|\svec{q}) = \sfrac12\sleft(P(I_{\svec{p}}|\svec{p}) + P(I_{\svec{q}}|\svec{q})\sright),$$\nand a moment's thought shows that this can be written as\n$$ P_{correct} = \sfrac12\ssum_n{\smathrm{max}[p(n),q(n)]}.$$\nThis is a fairly nice looking result, but we can improve it by using the following rather cute identity:\n$$ \smathrm{max}[p(n),q(n)] = \sfrac{p(n)+q(n)}{2} + \sfrac{|p(n)-q(n)|}{2}, $$\nwhich follows from the fact that $p(n)$ and $q(n)$ are spaced equally far from their mean. So we plug that in and recall that $\ssum_n{p(n)} = \ssum_n{q(n)} = 1$ and obtain\n$$P_{correct} = \sfrac12 + \sfrac14\ssum_n{|p(n)-q(n)|}$$\nThe quantity $\ssum_n{|p(n)-q(n)|}$ is called the ''1-norm'' of the vector $\svec{p}-\svec{q}$. It not only has a nice interpretation here as the //distinguishability// of two probabilistic states, but has a nice generalization to the comparable quantum problem!\n\n<<<\n4. Diagonalizing observables and unitary matrices.\nSuppose $A$ is a Hermitian matrix (i.e., a representation of an operator $\smathbf{A}$ on a finite-dimensional Hilbert space). Let us call the basis in which $A$ is written $\s{\sket{1},\sket{2},\sldots\sket{n}\s}$, so $A_{ij} \sequiv \sbraopket{i}{\smathbf{A}}{j}$.\n* (a) The Spectral Theorem implies that $A$ has a complete set of eigenkets $\s{\sket{a_i}\s}$ with eigenvalues $\s{a_i\s}$. 
Write down a matrix $U$ such that $UAU^\sdagger$ is diagonal.\n<<<\n\nIf we choose $U^\sdagger$ to be the matrix whose columns are the eigenvectors of $A$, $U$ will diagonalize $A$. Recall that kets are column vectors, so we write\n$$U^\sdagger = \ssum_k{\sketbra{a_k}{k}} \sLongrightarrow U = \ssum_k{\sketbra{k}{a_k}}$$\nTo prove that this works, note that\n$$UAU^\sdagger = \ssum_{k,l}{\sketbra{k}{a_k}A\sketbra{a_l}{l}} = \ssum_{k,l}{a_l\sket{k}\sbraket{a_k}{a_l}\sbra{l}} = \ssum_{k,l}{a_l\sdelta_{k,l}\sketbra{k}{l}} = \ssum_{k}{a_k\sproj{k}},$$\nso there are no off-diagonal elements.\n\n<<<\n* (b) Prove that $U$ is //unitary//.\n<<<\n\nA matrix $U$ is unitary if $U^\sdagger U = \sId$. In this case, since $U = \ssum_k{\sketbra{k}{a_k}}$,\n$$U^\sdagger U = \ssum_{k,l}\sketbra{a_k}{k}\sketbra{l}{a_l} = \ssum_{k,l}\sdelta_{k,l}\sketbra{a_k}{a_l} = \ssum_{k}\sketbra{a_k}{a_k} = \sId,$$\nwhere in the last step we've used the fact that the sum of the projectors onto any basis is equal to the identity operator.\n\n<<<\n* (d) Write $UAU^\sdagger$ in Dirac notation. Your answer should make it clear what the diagonal elements are.\n<<<\n\nFrom part (a), $UAU^\sdagger = \ssum_{k}a_k\sproj{k}$. The diagonal elements are the eigenvalues $a_k$.\n\n<<<\n* (c) The //commutator// of two matrices $A$ and $B$ is defined as $[A,B] = AB-BA$. Prove that $U$ commutes with its adjoint -- i.e., $[U,U^\sdagger]=0$.\n<<<\n\n$\sleft[U,U^\sdagger\sright] = UU^\sdagger - U^\sdagger U$. We already proved that $U^\sdagger U = \sId$, and now we observe that\n$$UU^\sdagger = \ssum_{k,l}\sketbra{k}{a_k}\sketbra{a_l}{l} = \ssum_{k,l}\sdelta_{k,l}\sketbra{k}{l} = \ssum_{k}\sketbra{k}{k} = \sId,$$\nso $\sleft[U,U^\sdagger\sright] = UU^\sdagger - U^\sdagger U = \sId - \sId = 0$.\n\n<<<\n* (d) A matrix that commutes with its adjoint is called //normal//, and the Spectral Theorem can be proved for all normal matrices. 
Use this to prove that the eigenvalues of $U$ are $\s{e^{i\stheta_k}\s}$, where the $\stheta_k$ are real numbers.\n<<<\n\n//Proof:// By the Spectral Theorem, $U$ has eigenvalues $u_k$ and corresponding eigenvectors $\s{\sket{u_k}\s}$. We can therefore construct a unitary $V$ so that $D = VUV^\sdagger$ is diagonal with diagonal entries $\s{u_k\s}$. Therefore, $U = V^\sdagger D V$ and $U^\sdagger = V^\sdagger D^\sdagger V$. Using these identities, we can write\n$$\sId = U U^\sdagger = V^\sdagger D V V^\sdagger D^\sdagger V = V^\sdagger D D^\sdagger V$$,\nand by multiplying by $V$ and $V^\sdagger$ on the left and right (resp) of both sides, we get \n$$D D^\sdagger = \sId.$$\nSince $D$ and $D^\sdagger$ are diagonal, the diagonal entries of $D D^\sdagger$ are $u_k u_k^*$, and since the previous equation implies they are also equal to 1, we have $u_k^* u_k=1$, which means that $u_k = e^{i\stheta_k}$ for some real numbers $\stheta_k$. QED.\n\n<<<\n* (e) The //infinity norm// of an operator $X$, denoted $||X||_\sinfty$, is the largest absolute value among $X$'s eigenvalues. It is a good measure of the overall magnitude of an operator (e.g., if $||X||_\sinfty=0$, then $X=0$). Prove the following theorem: For any real number $\sepsilon>0$, there is an integer $n>0$ such that $||U^n-\sId||_\sinfty < \sepsilon$.\n<<<\n\n[>img[images/HW2/HW2Sols-phases.png]]\nThis is a hard problem! -- mathematically quite a bit more sophisticated than anything else in this set. Because the time evolution of quantum systems is represented by a family of unitary operators like this, we're basically proving a quantum equivalent of the [[Poincare recurrence theorem|http://en.wikipedia.org/wiki/Poincare_recurrence]] -- i.e., that if a quantum system is allowed to run for long enough, then it will return to its starting state.\n\nLet us work in the eigenbasis of $U$, since it has one, and let $d$ be the dimension of $U$. 
Both $U$ and $\sId$ are thus diagonal ($\sId$ is diagonal in every basis), and $U$ has eigenvalues $e^{i\stheta_k}$ for $k=1\sldots d$. Furthermore, $U^n$ has eigenvalues $e^{in\stheta_k}$. We are going to regard these eigenvalues, $\s{e^{in\stheta_k}\s}$, as dynamical variables that evolve as we increase $n$. We are trying to prove that there is some $n$ such that $||U^n-\sId||_\sinfty < \sepsilon$, so note that\n$$||U^n-\sId||_\sinfty = \smathrm{max}_k\sleft(|e^{in\stheta_k}-1|\sright) = 2\s,\smathrm{max}_k\sleft(\sleft|\ssin\sleft(\sfrac{n\stheta_k}{2}\sright)\sright|\sright).$$\nSince $|\ssin(x)|\sleq|x|$ for all real $x$, the norm is less than $2\sepsilon$ if, for all $k$, $n\stheta_k\smathrm{\s mod\s }2\spi < 2\sepsilon$. Because $\sepsilon$ is arbitrary, the factor of 2 is harmless: replacing $\sepsilon$ by $\sepsilon/2$ throughout gives the stated bound.\n\nIf we relax the condition that $n$ be an integer, and let $n$ be a positive real number, then the phases $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ of $U^n$ for any $n$ represent a point on a $d$-torus. Furthermore, as $n$ increases from zero, they trace out a trajectory on the torus (see Figure 1), which wraps around whenever one of the $n\stheta_k$ passes an integer multiple of $2\spi$. This is the trajectory of the dynamical map $\s{\sphi_k\s} \srightarrow \s{\sphi_k + n\stheta_k\s}$ (it should be obvious that if we start with $\s{\sphi_k=0\s}$, then applying this map for integer $n$ gives the phases of $U^n$). Note also that this dynamical map is //linear// in the $\stheta_k$, so it preserves the shape of regions on the torus (see Figure 2).\n\n[>img[images/HW2/HW2Sols-phases2.png]]\nWe wish to show that for some $n$ (and all $k$), $\sleft(n\stheta_k\smathrm{\s mod\s }2\spi\sright) < 2\sepsilon$. This is equivalent to the point $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ lying within a hypercube of sidelength $2\sepsilon$ centered at the origin (red square in Fig. 
1) -- which, in turn, is exactly equivalent to the existence of //overlap// between a hypercube of sidelength $\sepsilon$ centered at $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ and another hypercube of sidelength $\sepsilon$ centered at the origin (green squares in Fig. 2).\n\nThe key to the proof is the fact that the entire torus has volume $(2\spi)^d$, and each hypercube of sidelength $\sepsilon$ centered at $\s{n\stheta_k\smathrm{\s mod\s }2\spi\s}$ has volume $\sepsilon^d$. As we increase $n$, we litter the torus with more and more of these regions (see Fig. 2), and eventually we run out of room! If $n>\sleft(\sfrac{2\spi}{\sepsilon}\sright)^d$, then there must exist some $n'\sleq n$ and some $m\sleq n$ such that the hypercubes around $\s{n'\stheta_k\smathrm{\s mod\s }2\spi\s}$ and $\s{m\stheta_k\smathrm{\s mod\s }2\spi\s}$ overlap.\n\nSince the dynamical map is linear (//note: the weaker condition of "area-preserving" is sufficient//), this implies that the hypercubes for $n'-1$ and $m-1$ also overlap, and by induction the hypercube for $N = n'-m$ overlaps the one at the origin -- which means that $N\stheta_k\smathrm{\s mod\s }2\spi < 2\sepsilon$. QED.
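//Numerical illustration (not part of the graded solution).// The recurrence statement can be checked by brute force for a small example. The sketch below assumes $U$ is already diagonal with eigenphases $\stheta_k$ and searches for the first integer $n$ with $\smathrm{max}_k|e^{in\stheta_k}-1| < \sepsilon$. The function name {{{recurrence_time}}}, the cutoff {{{n_max}}}, and the sample phases are hypothetical choices made only for illustration.

```python
import cmath
import math


def recurrence_time(thetas, eps, n_max=10**6):
    """Smallest integer n >= 1 with max_k |exp(i*n*theta_k) - 1| < eps.

    thetas: eigenphases of a diagonal unitary U (assumed given).
    Returns None if no such n is found up to the cutoff n_max.
    """
    for n in range(1, n_max + 1):
        # Infinity norm of U^n - Id in the eigenbasis of U.
        dist = max(abs(cmath.exp(1j * n * t) - 1) for t in thetas)
        if dist < eps:
            return n
    return None


# Commensurate phases: U^n returns exactly to the identity at n = lcm(3, 4).
n_rational = recurrence_time([2 * math.pi / 3, math.pi / 2], eps=0.1)

# Incommensurate phases (illustrative values): U^n never equals the identity,
# but it comes within eps of it after finitely many steps.
n_irrational = recurrence_time([1.0, math.sqrt(2.0)], eps=0.1)

print(n_rational, n_irrational)
```

The search is linear in $n$, so it is only practical for small $d$ and moderate $\sepsilon$; the volume argument above suggests the required $n$ can grow like $(2\spi/\sepsilon)^d$.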
The answers to these problems are due, via email to ta@am473.ca, by Wednesday October 15 at 11:30 AM. Please follow the [[Homework Policy]] for this class.\n//Note: I have attempted to write all operators in bold face for clarity of notation, except the Pauli operators $\ssigma_x,\ssigma_y,\ssigma_z$. Terms relating to group theory and distribution theory that you may not be familiar with are linked to the appropriate Wikipedia article; the group theory concepts are very basic ones with which most of you are familiar, while the distribution theory terms are simply ones that you may wish to learn more about.//\n\n1. ''The Quantum Zeno Effect'' (25 pts)\nConsider an idealized Stern-Gerlach apparatus, and let the silver atoms' angular momentum state be described by a state in a 2-dimensional Hilbert space with basis kets $\sket\suparrow$ and $\sket\sdownarrow$ representing "spin up" and "spin down", respectively, along the $\shat{z}$ axis.\n* (a) Suppose we want to prepare a beam of silver atoms in the $\sket{\suparrow}$ state. How (briefly) could this be done? (2 pts)\n* (b) The beam from (a) is fed into a Stern-Gerlach apparatus measuring $J_x$. We observe that there are two possible outcomes to the measurement, corresponding to $J_x = \spm\sfrac{\shbar}{2}$. Write down (in terms of the base kets above): (i) a set of bras corresponding to the measurement of $J_x$; (ii) the measurement itself as a set of projectors; (iii) the operator representing the observable $\smathbf{J_x}$; (iv) the probabilities of the outcomes $\spm\sfrac\shbar2$ when silver atoms prepared as in (a) are fed in. (8 pts)\n* (c) In a new experiment, the atoms from (a) are fed into a Stern-Gerlach apparatus oriented at angle $\sfrac{\spi}{4}$ between the $\shat{x}$ and $\shat{z}$ axes, which measures an observable we will call $\smathbf{J_{\spi/4}}$. What are the probabilities for the various outcomes? (3 pts)\n* (d) After the measurement in (c), all the atoms are recombined into a single beam. 
Write down its state. (2 pts)\n* (e) The beam from (d) is fed into a Stern-Gerlach apparatus measuring $\smathbf{J_x}$ [a la part (b)]. What are the probabilities of the $\spm\sfrac\shbar2$ outcomes? (3 pts)\n* (f) In yet another experiment, we pass the beam from (a) through //three// consecutive Stern-Gerlach apparatuses, oriented at $\sfrac\spi6$, $\sfrac\spi3$, and $\sfrac\spi2$ to the $\shat{z}$ axis (so the last is, again, measuring $\smathbf{J_x}$). After each measurement, the beams are recombined. What are the probabilities for the final $\smathbf{J_x}$ measurement? (4 pts)\n* (g) Consider the limit where we chain together $N$ Stern-Gerlach apparatuses, the $n$th of which is oriented at $\stheta = \sfrac{n\spi}{2N}$, and $N\srightarrow\sinfty$. What are the probabilities for the final $\smathbf{J_x}$ measurement? (3 pts)\n\n2. ''State and prove the Robertson Uncertainty Relation'' (15 pts)\n\n3. ''Unitary transformations and the Bloch sphere'' (20 pts)\n* (a) Consider a spin-$\ssmall{\sfrac12}$ system, and let $\svec{n}$ be a __unit__ vector in