Proposal: Improved Type Inference for Generic Instance Creation (original) (raw)

Jeremy Manson jeremy.manson at gmail.com
Fri Feb 27 23:54:07 PST 2009


Hey Neal,

I think you might have missed where I asked for an example where G1 and G2 might have different numbers of type parameters or they might have corresponding parameters in a different order. I'm probably missing a very obvious example, and I'm sure you have an example in mind.

On Fri, Feb 27, 2009 at 11:33 PM, Neal Gafter <neal at gafter.com> wrote:

Adding conversions doesn't make type inference work.  Each conversion that you want to affect type inference has to be accommodated specifically in the type inference algorithm.

I wasn't suggesting that adding conversions would make type inference work. I was suggesting that the conversion could be done in place of the type inference.

I disagree with your suggestion that this should fail:

List<?> list = new ArrayList<>(); // should fail The equivalent code using static factories works just fine.  It isn't clear why this feature should behave differently, or what part of your specification makes it fail.

Basically, because I specifically disallowed wildcards under the sample spec. See the second bullet point under "Semantics and Compilation". However, I wasn't convinced that that was a good idea either, which is why I put it under Additional Features (at the end). The described technique under that header would make this example work (probably -- I'd have to double-check the JLS).

I don't think supporting casts adds value to the feature.  You can already assign, for example, from ArrayList to List without a cast.

That was where Josh was leaning when I asked him about it, too. The reason I thought it made sense was just for consistency's sake; basically, it is the only one of the five conversion contexts where this technique might work, but it doesn't. For assignments and method parameters, it will work, and for String and numeric conversions, it doesn't make any sense. Casting was the only one left.

If you want to hear about the approach that Joe is working on, I'll be happy to describe it in person at your convenience (with Joe, if he wants).

I would definitely like to hear about it! How close is Joe to finishing a draft?

Jeremy

On Fri, Feb 27, 2009 at 10:56 PM, Jeremy Manson <jeremy.manson at gmail.com> wrote:

Neal,

I thought you had disappeared into the wilderness!  I might not have written this if I thought that you were on the job. Anyway, it sounds as if you have an example in mind where G1 and G2 can have different type parameters.  I have to say that the only ones that come to mind are the wildcard examples I give at the end of the proposal (where you have to infer the type's LUB).  I was only going to pursue that if people thought it was a good idea.  I imagine you have some others in mind? I did look at making modifications to the type inference algorithm, but I thought I could get away without it if you basically treat the entire operation as a conversion.  It makes the proposal simpler not to have to tackle that particular issue.  It seems less likely that this would be possible if G1 and G2 have to be able to have different type parameters. What do you think of the idea in the proposal for using this for casts?  Do you think it makes sense to be able to say: List list = (List<>) new ArrayList(); Jeremy On Fri, Feb 27, 2009 at 10:28 PM, Neal Gafter <neal at gafter.com> wrote: Jeremy-

Where your spec says "the type parameters of G1 in the conversion context are deemed to be identical to those of type G2" seems to assume that the types G1 and G2 are related in a way that their type parameters align exactly.  That is not the case in general.  G1 and G2 might have different numbers of type parameters or they might have "corresponding" parameters in a different order. The most significant problem with this proposal is that it completely ignores the most difficult aspect of this feature - the required modifications to the type inference algorithm specified in the JLS (15.12.2.7 and 15.12.2.8).  Joe Darcy and I discussed how to specify that on Friday, and I believe he is preparing a proposal. Regards, Neal On Fri, Feb 27, 2009 at 10:10 PM, Jeremy Manson <jeremy.manson at gmail.com> wrote: Improved Type Inference for Generic Instance Creation

(Sorry about the length; I probably got a little carried away) Author: Jeremy Manson Overview Feature Summary This proposal addresses the addition of limited type inference for class instance creation expressions to the Java programming language. In cases where parametrized types need to be explicitly declared for a constructor, and the full parametrized type < T1 , T2 , ...Tn > of that constructor is obvious from the context, then the  parameterized type of the constructor can be replaced with an empty set of type parameters: <>. The <>  construct is legal to use when constructing an ob ject and either assigning it to a variable, or when passing  it as a parameter. For example, consider the following assignment statement: Map<String, List> anagrams = new HashMap<String, List>(); This is rather lengthy, so it can be replaced with this: Map<String, List> anagrams = new HashMap<>(); Major Advantages Generics have had a tremendously positive impact on type safety in the Java programming language. They  have made it possible to provide static guarantees about the type of instances used by other classes, preventing entire classes of runtime errors from occurring. However, generics have added complexity to Java, making the code far more verbose and occasionally difficult to read. Although solving this problem is well outside the scope of this proposal, a limited form of type inference would remove unnecessary redundancy from the language. The requirement that type parameters be duplicated unnecessarily like this encourages an unfortunate overabundance of static factory methods, simply because type inference works on method invocations. For example, the Google collections library [2] contains a class that wraps every possible constructor of every subclass of java.util.List in the JDK: public class Lists {  public static List newArrayList() { return new ArrayList(); }  public static List newArrayList(int i) { return new ArrayList(i); }  // ... } List list = Lists.newArrayList(); This approach avoids the redundancy in the declaration, but requires two declarations for every possible constructor in every possible class. This is ugly: every time a constructor or a new subtype of List is created, this class must be updated. Furthermore, the names of these methods does not have to be consistent; there is no reason to favor newArrayList over createArrayList. The proposed feature would make this approach obsolete. Major Disadvantages As a minor change, the impact of this feature would have few disadvantages. It does have benefits and drawbacks when compared with other implementations, though. For example, it could be argued that inference would look cleaner without the <> token: Map<String, List> list = new Map(); Unfortunately, for the purposes of backwards compatibility, new Map() indicates a raw type, and therefore cannot be used for type inference. An argument can also be made for more full type inference: auto map = new HashMap<String, String>(); map.get("Foo"); // map is understood to be a Map<String, String> or, alternatively, for a typedef like feature, where type identifiers are assigned to longer typenames: typedef MyMap HashMap<String, List>; MyMap m = new MyMap(); Both of these approaches, which are present in the forthcoming C++ standard, would eliminate a lot of verbosity, at the expense of (potentially) obscuring the type of the variable. Neither of these proposals are precluded by this proposal; one, both or neither can be done at a later date. 2 Examples Here is a simple example, showing a use case for type inference for blank finals. An ImmutableMap is a Map implementation that throws an exception when mutation is attempted. In this example, the programmer writes <String, List> twice instead of four times, and once instead of twice: class AllAnagrams {  final Map<String, List> anagrams;  AllAnagrams(List originals) {  Map<String, List> map = new HashMap<>(); // saves typing  for (String s : originals) {  String alphagram = alphagram(word);  List group = anagrams.get(alphagram);  if (group == null)  anagrams.put(alphagram, group = new ArrayList<>()); // saves typing  group.add(word);  } anagrams = new ImmutableMap<>(map); // saves typing  }  private static String alphagram(String s) {  char[] chars = s.toCharArray();  Arrays.sort(chars);  return String.valueOf(chars);  } } Here is another example, showing a use case for type inference in parameter passing: class TreeNode {  T value;  TreeNode left, right;  private List<List> buildPathsInternal(  List<List> list, List currPath, TreeNode node) {  if (node == null) {  // type inference does not save typing here; see section on Wildcards.  list.add(new ArrayList(currPath));  return list;  }  currPath.add(node);  buildPathsInternal(list, currPath, node.left);  buildPathsInternal(list, currPath, node.right);  currPath.remove(node);  return list;  }  public List<List> getAllPaths(TreeNode head) {  // Type inference saves typing here:  return buildPathsInternal(new ArrayList<>(), new LinkedList<>(), head);  } } Details Specification Informally, the specification adds syntax to replace the type arguments < T1 ...Tn > to a class being instanti- ated by a call to new with the simple token <>. It also provides a new conversion that infers what the type arguments would have been from the context, and treats the instance creation as if those type arguments were present. Syntax The syntax of the new statement is currently given in two different ways in the Java Language Specification (JLS) [1] §15.9 and JLS§18.1 For simplicity, we restrict ourselves to the grammar given in §15.9. ClassInstanceCreationExpression ::=  new TypeArgumentsopt ClassOrInterfaceCreationType ( ArgumentListopt ) ClassBodyopt  Primary . new TypeArgumentsopt Identifier PossiblyInferredTypeArguments ( ArgumentListopt ) ClassBodyopt ClassOrInterfaceCreationType ::=  TypeDeclSpecifier PossiblyInferredTypeArgumentsopt PossiblyInferredTypeArguments ::=  <>  TypeArgumentsopt Semantics and Compilation Consider G1 , a generic type that is declared with 0 type arguments, and G2 ⟨T1 , T2 , ..., Tn ⟩, a generic type that is declared with n type arguments. There is a Type Inference Conversion from G1 to G2 if and only if  • G2 ’s erased type is a supertype of G1 ’s erased type and  • all type arguments T1 , T2 ...Tn uniquely identify reference types (i.e., they have no wildcard type arguments, although see below for more discussion of this). A Type Inference Conversion can take place in an assignment or method invocation conversion context (JLS §5). It might be possible to allow the inference to take place in a casting conversion context; see Section 6.1 for more discussion of this. Type Inference Conversions are unique in that they are not actually performed. Instead, they are used to replace the type they would ordinarily be converting: if a Type Inference Conversion can take place, then the type parameters of G1 in the conversion context are deemed to be identical to those of type G2 . Compilation The resulting bytecode would be identical to bytecode generated by the same code with the types inferred. Testing The proposed construct can be tested by writing a number of statements that construct instances using inferred types. These instances can then be assigned to variables and fields of specific types, and passed to methods that expect specific types. These tests should compile and run as expected: Set set = new SortedSet<>(); set.add("A"); set.add("B"); set.add("C"); // verify that set contains "A", "B", "C". Other tests can be performed by attempted to place new instances of inferred types in contexts that contain wildcards; these tests should fail: List<?> list = new ArrayList<>(); // should fail List list = new ArrayList<>(); list.add("A"); list.addAll(new ArrayList<>()); // should fail, since addAll expects  // Collection<? extends String> Library Support No library support is required, although libraries can be rewritten to take advantage of the new feature. Reflective APIs No reflective APIs need to be changed. The only related API is Class.newInstance; however, because generics are not reified in Java, this API does not take type parameters. A compiler cannot infer non- existent type parameters. Other Changes No other parts of the platform need to be updated. Migration Libraries can be rewritten to take advantage of this feature, but it is not imperative that they do so. Compatibility This change is backwards compatible. It will not make any changes to break existing compilation strategies or existing code. References Existing Bugs: 4879776, 6242254, 5092426, 6220689, 4983159, 6368076 [1] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java Language Specification, 3rd Edition. Addison Wesley, 2004. [2] Google Inc. Google Collections Classes. Additional Features Casting Conversion It would be possible to specify Type Inference Conversion so that it takes place in a Cast Context: List emptyList() {  return (List) new ArrayList<>(); } This would keep the casting context consistent with the other conversion contexts. However, the utility of this is questionable, so an argument could be made that this is a needless change. Wildcards In the method buildPathsInternal in Section 2, we need to use ArrayList’s copy constructor to create a copy of a List, but cannot: if (node == null) {  list.add(new ArrayList(currPath));  return list; } The inference does not work here because the ArrayList constructor expects an argument of type Collection<?_ _extends E>. The approach to inference described above needs to be able to determine the exact type of the type parameters being inferred. Since we don’t have an exact type here, the inference will fail. One possible tweak to our approach would be the ability to infer an actual type parameter from a wildcard. For example, if the least upper bound of a wildcard expression, when evaluated, would be a legal type parameter for the type being instantiated, we might consider inferring that type. This would make it possible to use type inference in the example above: if (node == null) {  // <List> is inferred from Collection<? extends E>  list.add(new ArrayList<>(currPath));  return list; } Further inference for method parameters As pointed out above, the result of a generic method: public static List newArrayList() { return new ArrayList(); } can be used for assignment: List list = Lists.newArrayList(); Although this construct can be used for assignment, it cannot be used in parameter passing when a type argument is required: public void expectStringList(List list) { ... } expectStringList(Lists.newArrayList()); // doesn’t compile expectStringList(Lists.newArrayList()); // does compile The fact that inference works for assignment but not for parameter passing is very confusing and arbitrary. Changing this would be consistent with the way type inference is presented in this proposal.



More information about the coin-dev mailing list