Homebrew, a very popular package manager for OS X, does not allow the user to install a specific version of a package.
Nor does it allow packages (“formulae” in its lingo) to specify versions or version ranges in their dependencies.
Instead, in Homebrew, packages just have names, and the names mean “the newest version released to Homebrew so far.”
—-
For example, here’s IPython on PyPI and GitHub. There, you can see lots of different versions, and you can see the newest ones require python >= 3.7, as advised in NEP 0029.
… and here’s IPython on Homebrew. There’s only one version, the latest one, whatever the latest one happens to be at $CURRENT_DATE.
And instead of depending on python >= 3.7, it requires python 3.8, which NEP 0029 will not demand until Dec 26, 2021. And work to bump that requirement to python 3.9 is apparently underway.
Formulae for apps that require Python 3 should declare an unconditional dependency on "python@3.x". These apps must work with the current Homebrew Python 3.x formula.
No more than five versions of a formula (including the main one) will be supported at any given time, regardless of usage. When removing formulae that violate this, we will aim to do so based on usage and support status rather than age.
[…]
Versioned formulae submitted should be expected to be used by a large number of people. If this ceases to be the case, they will be removed.
—-
Am I missing something, or is this really bad?
I’ve learned to call `brew install` as rarely as possible, because it will recursively update all dependencies of the thing I’m installing to Homebrew’s current versions – that’s the only thing it can do, no other versions “exist” – and this means replacing possibly large quantities of software that works fine with software that might not work.
And once that happens, you can’t get the old versions back. It was installed and running on your machine a moment ago, but to Homebrew it doesn’t exist anymore.
If you need to get old versions back, because you need your computer to work or some nonsense like that, you will probably find yourself reading this Stack Overflow thread, which has been chugging along since 2010 with no fully satisfying resolution. Some highlights:
[screenshots of Stack Overflow answers]
¯\_(ツ)_/¯
Engineering is about trade-offs. Latest version only and unconditional dependencies obviate the need for a SAT solver. Many Homebrew packages expect to deal with untrusted input from the network. Latest version only greatly simplifies issues surrounding securing old versions of software and aligning lifecycles of dependencies with different release cycles. A ton of seemingly boring bugs get fixed and don’t get CVEs with backports to all stable branches, because the security implications weren’t obvious to whoever found and fixed the bug.
Homebrew Python still provides pip, you can still spin up a virtualenv with a curated requirements.txt on Homebrew Python if that floats your boat.
Homebrew still needs its Python to support end user Python apps shipped as part of homebrew, including some apps that are pretty strongly evergreen. (Someone around here had a rant about youtube-dl in Ubuntu being broken by the time the distro releases).
If you need exact point releases of all your dependencies, including a specific version of postgres, docker might be the better fit for the job. I also hear good things about conda, but I can’t vouch for it, and its installers also seem to be tied to Python versions newer than NEP 0029 requires.
There are a bunch of things I’d rather see Homebrew change before better support for version pinning. I’d love to see them get out of a shared /usr/local that lots of other things pollute, handle conflicting binaries better, and track better data about when to rebottle due to changes in build-time dependencies.
My real hot take about reproducible computing on Mac is that it would be nice if macOS had a better container option for building and running macOS (not Linux) software.
Most of this is over my head – which is not a criticism. I’m not very familiar with package management in general, and I wrote the OP thinking maybe this behavior is normal and I’m just not used to it.
However, insofar as I understand your argument, I’m not convinced. It sounds like you’re arguing that, because Homebrew forces the user into all new releases, users of Homebrew will stay up to date with security patches:
Latest version only greatly simplifies issues surrounding securing old versions of software […] A ton of seemingly boring bugs get fixed and don’t get CVEs with backports to all stable branches because the security implications weren’t obvious to whoever found and fixed the bug.
But this cuts both ways. Experience has taught me not to ever run `brew install` or `brew update` unless I have hours of spare time set aside to deal with the fallout if necessary. So, I never run those commands unless I’m forced to – which means that, usually, none of these patches reach my machine.
—-
Taking a step back: I don’t think I necessarily object to a lack of support for multiple package versions. (Since Homebrew is mostly a binary installer these days, I understand that supporting these would be a large cost for their build process.)
What I really object to is the inherent instability of Homebrew-core, the collection of packages you are pulling from when you run `brew install` or `brew update` as a typical user.
Unlike virtually any other mature project I interact with, Homebrew-core does not have versions or releases. It is a git repo with one branch, no tags, ~179000 commits to master, and ~59000 closed PRs.
Using an “up to date” Homebrew (which will happen unless you try hard to stop it) means using the very latest built commit to this master branch, which probably occurred within the last 24 hours.
—-
I’m not actually using Homebrew for development – I have a few dev tools installed through it, but I’m not looking for version pins so I can build software. I’m just trying to install software as a normal user, so I can use it.
And if something breaks, I want to be able to say “okay, I’ll try downgrading back to version 7.3.11” or something like that. Some pointer to the thing I had before I updated. Like I get with any other software.
I can’t do that with Homebrew packages. I can’t do it with Homebrew-core, the collection of Homebrew packages. The closest things to version numbers are individual commits to homebrew-core master, and even then I don’t know which commit I was on yesterday, before I ran `brew update` (desperate times call for etc.)
I do know which commit I’m on now, though! `brew --version` tells me:
Homebrew/homebrew-core (git revision 8a34ac; last commit 2020-10-27)
Many commits have been made in the 22 hours since then, and every one makes all prior Homebrew configurations effectively unrecoverable, if usually in a superficially harmless way.
History moves forward and the past is erased. What will be true tomorrow? In a month? In six months?
And how will I even know the name of the ephemeral past I have lost? As “8a34acb309ba9d62b2d0377fe76c1a5731ddacc7”, a hash I was careful enough to write down this time around? Seriously?
Homebrew is awful for a number of reasons, but why use it to manage Python dependencies? Can't you just install IPython using pip? Ideally in a virtualenv?
You can, and these days I would.
However, installation pages for python applications often recommend installing them via Homebrew even if they are on PyPI, and some are not on PyPI at all. (I’m specifically talking about standalone applications, like development utilities, not libraries I want to use in the project when I’m developing.)
So, a python developer working on a Mac probably uses some tools that have been installed with Homebrew, unless they have been careful to avoid Homebrew from the moment they received the Mac. These then need to be kept up to date, etc. with Homebrew.
This category includes applications commonly used to manage virtualenvs or python itself, like pipenv, virtualenv, and pyenv.
The first time I installed pipenv was via Homebrew, because a setup tutorial at work told me to type `brew install pipenv`. This is now officially discouraged by pipenv (I don’t think it was at the time), for a reason I later encountered on my own: pipenv uses the python version which installed it to create its virtualenvs, and virtualenvs contain many symlinks to the python that created them – links which will point back to Homebrew python, and which will break if Homebrew ever decides to “update” python.
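The breakage mechanism can be sketched in a few lines. This is a toy reconstruction with made-up paths, not pipenv’s actual layout: a venv’s `bin/python` is a symlink to the interpreter that created it, and if that interpreter’s directory is replaced (as when Homebrew “updates” python), the link dangles.

```python
import pathlib, shutil, tempfile

# Toy reconstruction (all paths made up) of the broken-symlink problem:
# the venv's bin/python points at the interpreter that created it.
tmp = pathlib.Path(tempfile.mkdtemp())

keg = tmp / "opt" / "python@3.8" / "bin"       # stand-in for a Homebrew keg
keg.mkdir(parents=True)
(keg / "python3").write_text("#!fake interpreter\n")

venv_python = tmp / "venv" / "bin" / "python"  # stand-in for the venv's symlink
venv_python.parent.mkdir(parents=True)
venv_python.symlink_to(keg / "python3")

worked_before = venv_python.exists()           # True: the link resolves

shutil.rmtree(tmp / "opt" / "python@3.8")      # the "update" removes the old keg

# The symlink still exists, but it no longer points at anything.
broken_after = venv_python.is_symlink() and not venv_python.exists()
print(worked_before, broken_after)             # True True

shutil.rmtree(tmp)
```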
Do you have a take on the Performer paper? To someone not super familiar with NLP it seems massive, but I'm not sure if there's some hidden trade-off that you have to read between the lines to see.
I had not heard of it.
Looking it up now, it looks like part of the literature trying to overcome the quadratic complexity of attention. This is a big area and I haven’t read any of the papers in it, so I don’t have useful comments.
To put it in context, I’d start with gwern’s sub-page about efficient attention, which links a bunch of lit reviews and individual papers. As you can see from the table shown there – which includes Performer – many approaches to this problem have been published recently.
Did the Performer paper seem impressive to you because it got attention to O(n), or for some other/additional reason?
It seems like Python (or a similarly architected language) would need a specially designed data structure for efficiently manipulating large arrays/matrices. Furthermore, there would be a huge performance hit if you used native loops rather than broadcasting ops. So, do you object to the basic setup of Pandas, or do you think it just did a shit job of being a good library that does that?
Pandas gets these fast linear algebra tools from numpy, and I don’t object to numpy. Pandas adds things on top of numpy, and I object to those.
Pandas is not a package for fast linear algebra, it’s a package for running queries on 2D data structures that resemble queries on a relational database. So it introduces things like:
Named and typed “columns” (AKA “fields”).
This means we are thinking about matrices from a very different perspective from abstract linear algebra: not only do we fix a preferred basis, but the columns may even have different types that cannot be added to one another (say float vs. string).
(I mention this to emphasize that pandas is not just an obvious extension of numpy, nor is numpy obviously the right foundation for pandas.)
A typed “index” used to identify specific rows.
Operations similar to SQL select, join, group by, and order by.
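The first point is easy to demonstrate (hypothetical example data; requires pandas): columns are named and typed, and two columns of the same DataFrame may be mutually incompatible in a way two columns of a numpy matrix never are.

```python
import pandas as pd

# A DataFrame is a matrix only superficially: its columns are named
# and can carry incompatible types (made-up example data).
df = pd.DataFrame({"price": [1.5, 2.0], "ticker": ["AAPL", "MSFT"]})
print(df.dtypes["price"], df.dtypes["ticker"])   # float64 object

# Unlike two columns of a numeric matrix, these can't be added:
try:
    df["price"] + df["ticker"]
except TypeError:
    print("columns have incompatible types")
```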
In other words, interacting with a data table in pandas is similar to running SQL queries on a database. However, the pandas experience is (IME) worse than the SQL experience in numerous ways.
I’ve used pandas almost every day of my life for around three years (kind of sad to think about tbh), and I still frequently have to look up how to do basic operations, because the API is so messy. I never forget how to do a join in SQL: it’s just something like
SELECT […] FROM a
JOIN b
ON a.foo = b.bar
To do a join in pandas, I can do at least two different things. One of them looks like [screenshot of one call signature], and the other looks like [a second screenshot, chopped off at my screen height!]
If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames. Got that?
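For the record, the two spellings here are presumably the top-level `pd.merge` function and the `DataFrame.merge` method, each with the same sprawling set of keyword arguments (column names below are made up):

```python
import pandas as pd

a = pd.DataFrame({"foo": [1, 2, 3], "x": ["i", "j", "k"]})
b = pd.DataFrame({"bar": [2, 3, 4], "y": ["p", "q", "r"]})

# One way: the top-level function...
m1 = pd.merge(a, b, left_on="foo", right_on="bar")
# ...and the other: the method, with the same forest of keywords.
m2 = a.merge(b, left_on="foo", right_on="bar")

print(m1[["foo", "bar"]].values.tolist())   # [[2, 2], [3, 3]]
```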
Let’s not even talk about “MultiIndices.” Every time I have the misfortune to encounter one of those, I stare at this page for 30 minutes, my brain starts to melt, and I give up.
As mentioned earlier, the type system doesn’t really let columns be nullable: put a single missing value into an integer column and the whole column is silently upcast to float. This is incredibly annoying and makes the column types next to useless. The limitation originates in numpy’s treatment of NaN, which makes sense in numpy’s context, but pandas just inherits it in a context where it hurts.
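The NaN problem in two lines (requires pandas; newer pandas versions do add an opt-in nullable `Int64` extension dtype that mitigates this, but it is not the default):

```python
import pandas as pd

s = pd.Series([1, 2, 3])
print(s.dtype)                  # int64

# One missing value and the whole column is silently upcast to
# float64, because numpy represents "missing" as the float NaN.
s2 = s.reindex([0, 1, 2, 3])    # index 3 has no value
print(s2.dtype)                 # float64
```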
There’s no spec: behavior is defined by the API docs and by the implementation, and both change between versions.
Etc., etc. It’s just a really cumbersome way to do some simple database-like things.
Thinking about your criticism of Jupyter (which I haven't used much, but which also applies to Mathematica which I use all the time), doesn't the same criticism apply to physical pen-and-paper notebooks, and blackboards? Especially if you use an eraser to go back and fix mistakes.
I don’t think so.
When someone writes out a calculation in a physical notebook, or on a blackboard, it doesn’t produce any “state” apart from what ends up in the minds of people reading it.
Every assumption used in the calculation is either explicitly written out, or supplied by the readers as tacit background knowledge. There is no category of “assumptions used in the calculation which the blackboard knows, but no reader could know.” A proof on a blackboard that uses information “only the blackboard knows” is just an invalid proof on a blackboard.
When someone writes out a calculation in a computer notebook (Mathematica, Jupyter, or the like), the notebook really does “know things” in a meaningful sense, and these are not just the implications of its current written contents. (Unless you are doing purely functional programming, which is one way to keep a notebook from getting weird.)
In the notebook’s mind, every step it computed in the current session still holds true, even if that step was later erased. This is different from what happens in a reader’s mind: the reader only considers the outcomes of the steps that they see in front of them. This divergence between two belief states is impossible with a physical notebook.
Generally we want the computer to know things we don’t. This is less important when doing pure math, where you still want to obtain a “blackboard-valid” proof at the end of your work. But it’s very important in numerics or data analysis, where we use the computer to work with huge matrices or data sets which would take huge quantities of paper just to write down, and even huger ones if we need to derive things about them.
So the set of facts which the notebook knows, but we don’t, contains important things that determine the results of our work. If we lose track of how this set of facts came to be, we don’t know what our work means anymore.
A simple example. Say I write this on a blackboard:
y = 2x
x = 1
⇒ y = 2
A valid derivation. Now, I erase the first line:
x = 1
⇒ y = 2
Now it’s just an invalid derivation. There is no sense in which the calculation is “still valid, but for mysterious reasons.”
In a computer notebook, I could write (assuming the notebook language handles assignment in a way resembling our notation on the blackboard):
>> y = 2x
>> x = 1
>> y
2
and now erase the first line:
>> x = 1
>> y
2
As on the blackboard, this is no longer a valid derivation for the reader. But the notebook remembers what was going on. So I can go on to derive further results that make sense, but only in light of the notebook’s secret knowledge:
>> x = 5
>> y
10
Because people regularly go back and fix their mistakes, this situation – with “secret knowledge” used but never written out – is the norm in computer notebooks. Preventing it requires special care, attentiveness, and discipline.
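As a footnote to the remark above about purely functional style: if the dependence of y on x is written as a function, there is no session memory for an erased line to hide in, because every assumption is an explicit argument (a toy Python sketch):

```python
# Instead of mutating notebook-global x and y, make the relation a
# pure function; nothing depends on vanished assignment history.
def y(x):
    return 2 * x

print(y(1))   # 2
print(y(5))   # 10
```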
How the hell did Don DeLillo write a book as good as Libra right after writing one as bad as White Noise? Is it really even the same guy?? Did he take some kind of black-market writing-enhancement pills or what
i’m sure i’ve missed a few things, but i can’t stand to look at it any longer. i present to you: the good, the bad, and the ugly of tumblr throughout the decade