Tutorial: how to deal with strings #65
Comments
This would be a good tutorial to have as a reference 👍 I cannot find a reference right now, but I believe the |
I do exactly the same thing as your examples of function accepting and subroutine returning strings. I imagine this is common use.
Question 1: You're doing nothing wrong. Gfortran is warning about correct Fortran.
Question 2:
module mod_str
contains
  pure function str()
    character(:), allocatable :: str
    str = 'hello'
  end function str
end module mod_str
program test_str
  use mod_str, only: str
  print *, str()
end program test_str
|
Continuing Milan's example:
program test_str
  use mod_str, only: str
  character(:), allocatable :: my_string
  my_string = str()
end program test_str
My understanding about this usage is that there are two allocation-on-assignments happening: one in the function for the function result; and one for the assignment at program level. |
(sorry, closed by mistake) |
@LKedward that is precisely why I asked about this. If that is the case, that seems like a big downside and our string routines in |
Yep, I haven't benchmarked it but this is why I generally avoid functions for returning non-scalars. You can use pointers to return allocated arrays from functions more efficiently, but I also avoid using pointers.
NB: Allocation on assignment
Another useful thing to note, which I only learned recently, is that allocation-on-assignment doesn't occur when the left-hand side has a colon subscript. So this doesn't work:
program test_str
  use mod_str, only: str
  character(:), allocatable :: my_string
  my_string(:) = str()
end program test_str
Based on this, I would consider it good practice to use the colon subscript to explicitly indicate where there is assignment only and to avoid accidental reallocation.
Question: filling a character string
I have my own related question for strings: Is there a one-liner for filling a character(*) with a non-space character(1)? |
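A minimal sketch of the difference between whole-variable and colon-subscript assignment, which also shows one possible answer to the fill question using the standard intrinsic repeat. The program and variable names are illustrative only.

```fortran
program fill_string
  implicit none
  character(:), allocatable :: s

  ! Whole-variable assignment (re)allocates s to the length of the RHS.
  s = 'hello'
  print *, len(s)        ! length is now 5

  ! Colon-subscript assignment never reallocates: the RHS is truncated
  ! or blank-padded to fit the existing length of s.
  s(:) = 'hi'
  print *, len(s), '[' // s // ']'

  ! One way to fill an already-allocated string with a single character:
  ! repeat() builds a string made of len(s) copies of that character.
  s(:) = repeat('*', len(s))
  print *, s
end program fill_string
```

The same repeat call should also work for a fixed-length or character(*) dummy argument, since len(s) is known there as well.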
Yes, I think this is true for any function returning anything allocatable. It's especially penalizing for large arrays. Don't do it if you care about high performance. I have a toy wave physics project that did this for everything, including large arrays. I was optimizing for functional API and UI, although at the time I didn't understand the implications of functions returning allocatable arrays. Later I heard from a person who found the code to do exactly what they needed but it was too inefficient so they rewrote everything to subroutines to make it fast :). |
Regarding functions returning allocatable --- is this mandated by the Fortran Standard to allocate twice, or are compilers permitted to make it as efficient as |
It would make sense that if the function is able to be inlined, then one allocation could be optimized out, but I'm no expert here. I think that in general, the function result needs to be a distinct memory location because it may be used subsequently in an expression; i.e. there is a fundamental difference between a function result and a subroutine Note 1, section 15.6.2.2 from the interpretation doc:
|
My understanding of the text you posted is that the Standard allows the result of the function to be as efficient as an |
Would such an optimization be prevented by the requirement that the RHS is evaluated before the assignment occurs? From 10.2.1.3:
for
|
I don't know. We might need to ask the committee. My understanding of it is that the key is "shall have the same effect", in other words, it does not actually have to happen that way, only have the same effect. So the question then becomes whether double allocation has the same effect as single allocation. For a string, it seems the logic of the code would be the same. For user derived types perhaps the user requires the finalizer to be called twice. |
Regarding Question1:
However, since we are into this discussion, I also have something to add about the behavior of allocatable characters that may be relevant. This does not work:
character(len=:),allocatable :: str
subroutine init_string(filename, str)
  character(len=*),intent(in) :: filename
  character(len=:),allocatable, intent(out) :: str
  open(file...)
  read(unit,*)str
  close(file...)
end subroutine init_string
while this is correct:
character(len=:),allocatable :: str
subroutine init_string(filename, str)
  character(len=*),intent(in) :: filename
  character(len=:),allocatable, intent(out) :: str
  character(len=50) :: temp ! 50 is just a random number for demonstration purposes
  open(file...)
  read(unit,*)temp
  str = trim(temp)
  close(file...)
end subroutine init_string
Another interesting behavior is when the allocatable character in the above example is part of a derived type, e.g.:
type t_gas
  character(len=:),allocatable :: name
  double precision :: mass
  ! etc...
end type t_gas
Now assume we defined a type(t_gas)::gas and tried to read gas%name as we did in the first, nonworking example. The program then runs without any error, but in reality gas%name remains uninitialized: you can print it and it just returns blank, but NO error!! |
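A self-contained sketch of the fixed-buffer workaround described above. It assumes an internal file (the character variable source) as a stand-in for the real file, since the open and close arguments are elided in the comment; the names source, buffer and name are hypothetical.

```fortran
program read_into_allocatable
  implicit none
  character(len=20) :: source          ! stands in for a line of a file
  character(len=50) :: buffer          ! fixed-length buffer, size is arbitrary
  character(:), allocatable :: name

  source = 'argon 39.948'

  ! List-directed read into the fixed-length buffer: the first
  ! blank-delimited field is stored and the rest is blank-padded.
  read(source, *) buffer

  ! Allocation on assignment sizes name to the trimmed content.
  name = trim(buffer)

  print *, '[' // name // ']', len(name)
end program read_into_allocatable
```

The buffer length caps how long a field can be read, so it has to be chosen generously for the data at hand.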
@smeskos I think you cannot |
I've generally just resorted to using a string type for everything, and then for |
will output |
Perfect, thank you @ivan-pi! |
To this day I am struggling with how to deal with strings in modern Fortran. I would be happy to contribute a tutorial, once I learn what the best practice is.
Function accepting a string
Note: the first argument in character(...) is len, so the above is equivalent to character(len=*). I think it is ok to not specify len, as things are shorter then.
Subroutine returning a string
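A minimal sketch of both patterns, a function accepting a string and a subroutine returning one. This is not the snippet from the original post; the names string_demo, count_blanks and get_greeting are made up for illustration.

```fortran
module string_demo
  implicit none
contains

  ! Function accepting a string: character(*) declares an assumed-length
  ! dummy argument and is equivalent to character(len=*).
  integer function count_blanks(s) result(n)
    character(*), intent(in) :: s
    integer :: i
    n = 0
    do i = 1, len(s)
      if (s(i:i) == ' ') n = n + 1
    end do
  end function count_blanks

  ! Subroutine returning a string through a deferred-length
  ! allocatable intent(out) argument.
  subroutine get_greeting(s)
    character(:), allocatable, intent(out) :: s
    s = 'hello world'   ! allocation on assignment, no padding
  end subroutine get_greeting

end module string_demo

program demo
  use string_demo, only: count_blanks, get_greeting
  implicit none
  character(:), allocatable :: msg

  call get_greeting(msg)
  ! msg has length 11 and contains one blank
  print *, len(msg), count_blanks(msg)
end program demo
```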
Note: This automatically allocates the LHS, so s will get allocated to the length of the string, no white space padding.
Question 1
In fpm, the following code:
Gives a warning:
What am I doing wrong? How do I initialize an empty string?
Question 2
How do you return a string from a function as a return value?
I will probably have more questions. These are the most pressing.